from the editor
Editor in Chief: Steve McConnell ■ Construx Software ■ [email protected]

Closing the Gap
Steve McConnell

Years ago, Fred Brooks commented, “The gap between the best software engineering practice and the average practice is very wide—perhaps wider than in any other engineering discipline.” The past few years have seen a proliferation of books on project management, requirements, architecture, design, testing—nearly every area of software engineering. But within the companies I visit in my consulting business, I rarely see software engineering best practices being used. Increasingly, I ask myself, “Why aren’t people using the numerous good software engineering practices that are now so readily available?”
Classic barriers to innovation
A conventional answer to this question is that many of these practices simply aren’t yet mature. When presented with a new practice, software practitioners tend to ask tough questions such as these:1

■ Do experimental results prove conclusively that the practice will work in the field?
■ Are successes a result of the practice itself, or might they be the result of the people using it?
■ Is the practice complete, or does it need to be adapted or extended before it can be applied?
■ Does the practice have significant overhead (training, documentation) that offsets its value in the long run?
■ If the practice was developed in a research setting, does it apply to real-world problems?
■ Does the practice generally slow down the programmers?
■ Can the practice be misapplied?
■ Is information available about the risks involved with using the practice?
■ Does the practice include information about how to integrate it with existing practices?
■ Must the practice be applied in its entirety to realize significant benefits?
These are all fair questions, and I think it’s healthy for practitioners to ask them. Indeed, part of IEEE Software’s mission is to help our readers answer these questions. However, the practices I’m thinking of are hardly new, and, for many of them, I believe many of these questions have already been answered. Table 1 lists numerous practices that leading organizations have understood well and deployed for decades.

In the management arena, we’ve had automated estimation tools since the early 1970s, but most projects don’t use them. Measurement has been a key topic for 25 years, but few organizations collect quantitative data on their projects. I still see software developers housed in open work bays or cubicles far more often than I see them working in private or semiprivate offices—even though research about the effect of physical environment on productivity has been conclusive for more than 15 years.

One of the most fundamental practices in software engineering is change control, especially as it relates to software requirements. I teach a two-day workshop based on my book Software Project Survival Guide (Microsoft Press, 1998).
Table 1. Year of introduction of rarely used software best practices

Best practice | Year first described in print or first available commercially

Project planning and management practices
  Automated estimation tools | 1973
  Evolutionary delivery | 1988
  Measurement | 1977
  Productivity environments | 1984
  Risk management planning | 1981

Requirements engineering practices
  Change board | 1978
  Throwaway user interface prototyping | 1975
  Joint Application Design | 1985

Design practices
  Information hiding | 1972
  Design for change | 1979

Construction practices
  Source code control | 1980
  Incremental integration | 1979

Quality assurance practices
  Branch-coverage testing | 1979
  Inspections | 1976

Process improvement
  Software Engineering Institute’s Software Capability Maturity Model | 1987
  Software Engineering Process Groups | 1989
When I originally developed the workshop, I included a module on change control, because I could easily pull together the necessary materials and I was working under some deadline pressure. I assumed that it would be too basic for most of my students and that I would need to replace that module as soon as I had time. To my surprise, three years later, after teaching the class about 20 times, I’ve had only one group of students in which more than half were already using change control.

Change control has been described in the software engineering literature since 1978, but the basic practice has been employed in other branches of engineering for at least 50 years. All the tough questions listed earlier were answered for change control decades ago. Considering the practice’s central role in software project control, I am puzzled about why software projects don’t use this fundamental practice universally.

Barriers to software innovations
Software presents unique challenges to adopting better practices. One challenge is a lack of awareness that good practices exist. Where, ideally, should someone learn about fundamental software engineering practices? In most fields, we expect universities to provide education in the fundamentals. Until very recently, however, most undergraduate degree programs related to computer programming have not included training in these basic practices. Additional university programs are coming online
each year, and I think the lack of infrastructure is due simply to software engineering’s being such a young field.

In the absence of university education systems, we might expect software-producing companies themselves to provide supplemental training. In fact, a few leading companies do train their software engineers, but not to an extent great enough to ameliorate industry-wide software problems. In less advanced companies, the lack of training has been more difficult to address. Before a manager can prescribe training, he needs to know that a field of knowledge is deep enough to need training. Managers who came up through the technical ranks 20 years ago, or even 10 years ago, might underestimate the depth of knowledge in modern software engineering. Many software managers are not themselves well trained enough to realize that their staff needs training.
Calling all experts
These are all descriptions of what has not been done, but they still leave open a basic question: Why don’t software engineers—who are some of the brighter people on the planet—seek out better methods of doing their work? We’re all aware of the pain arising from not using these practices. So why don’t practitioners more actively seek them out and use them? With all the advances during the past several years, it appears that the challenge for the software industry has shifted from good-practice development to good-practice deployment.
What do you see as the barriers to deployment of good practices? How do you think good practices can be deployed more quickly? I invite your comments.
Reference
1. S.A. Raghavan and D.R. Chand, “Diffusing Software-Engineering Methods,” IEEE Software, vol. 6, no. 4, July 1989, pp. 81–90.
Andy Hunt and Dave Thomas Join IEEE Software Editorial Board

Andy Hunt and Dave Thomas, founders of The Pragmatic Programmers LLC, recently joined IEEE Software’s Editorial Board. Prior to joining Pragmatic Programmers, Hunt worked in various senior positions at Discreet Logic, Alias Research, Philips Medical Systems, and AT&T. He received his BS in information and computer science at the Georgia Institute of Technology. He is a member of the IEEE Computer Society, the ACM, and the Independent Computer Consultants Association.

Thomas cofounded and ran a software company in the United Kingdom prior to joining Pragmatic Programmers. He holds an honors degree in computer science from London University. He is a member of the IEEE Computer Society and the ACM.

Hunt and Thomas have coauthored two books, The Pragmatic Programmer: From Journeyman to Master (Addison-Wesley, 2000) and Programming Ruby: The Pragmatic Programmer’s Guide (Addison-Wesley, 2001). They have also written a number of articles together, including “Learning to Love Unit Testing” for Software Testing and Quality Engineering Magazine (Jan. 2002) and “Programming in Ruby” for Dr. Dobb’s Journal (Jan. 2001). Individually and together, they have also given numerous talks and tutorials at conferences and workshops. Contact Andy Hunt at [email protected] and Dave Thomas at [email protected]; www.pragmaticprogrammer.com.
design
Editor: Martin Fowler ■ ThoughtWorks ■ [email protected]

Modeling with a Sense of Purpose
John Daniels

It’s all a question of purpose. These days, practically everyone involved in developing software draws pictures that represent some aspect of the software or its requirements. They do this to improve their own understanding and, usually, to communicate that understanding to others. But all too often, the understanding is muddled and confused because the designer hasn’t clearly established the picture’s purpose or explained how others should interpret it. Surprisingly, this is true even when the designer uses an established modeling standard, such as the Unified Modeling Language (UML).

For many years, it has been common practice for database designers to create logical models and physical models. Typically, designers represent a logical model using some form of the entity-relationship (ER) diagram, whereas they ultimately represent a physical model using a schema held by the database engine. The important difference is the level of abstraction: a logical model ignores the constraints that the underlying database technology imposes and presents a simplified view. Sometimes physical database designs are also drawn using ER diagrams but with explicit attributes for foreign keys that represent relationships. So, when presented with an ER diagram I’ve never seen before, I must establish its purpose before I can understand it. Fortunately, with an ER diagram, it’s usually easy to see what kind of model this is, but in other circumstances—
especially with UML diagrams—it isn’t always so obvious.

Implementation, specification, and conceptual models
I call a model built to explain how something is implemented an implementation model and a more abstract model that explains what should be implemented a specification model. Both are models of software systems, and if confusion often exists between these two models, then far greater confusion surrounds the relationship between models of software and models of real-world situations.

In software projects, we frequently need to find ways of gaining a better understanding of the real-world problem to be solved. Consequently, we often produce information models that depict the items of information in a particular business situation and their relationships, but two issues immediately arise. First, what should be the scope of such models? As we saw with the Great Corporate Data Modeling Fiasco of the 1980s and 1990s, where large enterprises invested heavily in attempts to capture all the information used in an organization, these models easily become so large that they are impossible to keep up to date. The solution to this issue is, of course, to model only those parts of the organization needed to represent the current situation of interest—not to worry too much about the bigger picture. Second, these models have no value unless we can directly apply their content to system-building projects. Unfortunately, for reasons that I will explain shortly, the approach claiming to offer the most assistance with this—object-oriented design—has been slow to deliver effective processes.

I’ll refer to all models of situations in the world—of which information models are one example—as conceptual models; other terms that are used include essential models, domain models, business models, and even, most confusingly in my view, analysis models. So we have three kinds of models (see Figure 1) for quite different purposes:

■ Conceptual models describe a situation of interest in the world, such as a business operation or factory process. They say nothing about how much of the model is implemented or supported by software.
■ Specification models define what a software system must do, the information it must hold, and the behavior it must exhibit. They assume an ideal computing platform.
■ Implementation models describe how the software is implemented, considering all the computing environment’s constraints and limitations.
Figure 1. Three modeling perspectives. A conceptual model is a model of the world, built to understand and interpreted as statements of fact; a specification model is an abstract model of software, built to specify and interpreted as constraints on behavior; an implementation model is a model of software, built to explain and interpreted as descriptions of behavior. A systematic correspondence links the specification and implementation models.

Most of the modeling I’ve done has been in the field of object and component systems, where opportunities for confusion between these three modeling perspectives are particularly large. This is because with object-oriented software, we are always striving to make the software elements that make up our program (classes, in most object-oriented languages) correspond closely to problem-domain concepts with similar names. So, if our conceptual model contains the concept of customer, our software will contain direct representations of customers, and our software customers will have similar attributes to their real-world counterparts. We want this correspondence because it improves traceability between requirements and code, and because it makes the software easier to understand. In this respect, object-oriented programming has been a great success, but the yearning for this correspondence between the conceptual and implementation models has oversimplified the process of moving between them. One reason why this process is more complicated than we’d like is that, given the complexity of modern layered application architectures, we find different representations of the same concept in many different parts of the application.

Notation overload
The examples we’ve looked at so far—information models and database models—are concerned with
modeling the structure of things; life gets much more complicated when we attempt to model behavior. If we construct an implementation model of an object-oriented program—for example, one written in Java—we assume that software objects cooperate by sending each other synchronous messages. It might or might not be satisfactory to adopt the same paradigm when creating specification models of object systems, but that paradigm is certainly useless if we want to capture the world’s concurrent and unpredictable nature in a conceptual model. Despite these obvious problems, many methodologists devising ways of designing object-oriented systems during the late 1980s and 1990s persisted with the claim that conceptual models could be built using only the concepts of object-oriented programming. Furthermore, they claimed that the process of moving from conceptual model to code was one that simply involved elaboration rather than mapping or translation. Although we’ve since learned the hard way that this is too simplistic, that realization hasn’t been manifested in the modeling notations we use or in the way people are often taught to use them.

Consider, for example, the facilities in the UML for describing software structure. The primary and ubiquitous notation for this is the class diagram, and we should be pleased that this relatively simple notation has proved useful for depicting everything from the structure of Web sites to the layout of XML documents. But its generality is part of the problem. Even within the limited scope of a single development project, you’ll typically find class diagrams used for a range of purposes, including the structural aspects of all three kinds of models discussed earlier. Figure 2 illustrates this with a simple example. In the conceptual model, we are happy to use modeling features that aren’t generally supported by programming environments, such as dynamic subtyping. In the implementation model, we use only features that our chosen language directly supports, and we show implementation choices such as the use of the strategy pattern here.

Figure 2. One notation, three models. The conceptual model shows an Order with a deliveryCharge attribute and a dynamic subtype UrgentOrder; the specification model shows an Order with deliveryCharge and isUrgent attributes and a beUrgent operation; the implementation model shows an Order class whose deliveryCharge() delegates to a DeliveryStrategy interface (getCharge()) implemented by RegularDelivery and UrgentDelivery classes.
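The column doesn't include source code for Figure 2's implementation model, but the strategy-pattern choice it mentions is easy to picture in Java. The following sketch is illustrative only: the class and method names follow the figure, and the Money type is simplified to a plain double.

    // Implementation-model view of Figure 2: the delivery-charge calculation is
    // delegated to a strategy object, so an order can switch between regular and
    // urgent delivery at runtime without changing its own class.
    interface DeliveryStrategy {
        double getCharge();
    }

    class RegularDelivery implements DeliveryStrategy {
        public double getCharge() { return 3.00; }   // illustrative flat rate
    }

    class UrgentDelivery implements DeliveryStrategy {
        public double getCharge() { return 10.00; }  // illustrative premium rate
    }

    class Order {
        private DeliveryStrategy delivery = new RegularDelivery();

        public boolean isUrgent() { return delivery instanceof UrgentDelivery; }

        // Corresponds to the specification model's beUrgent operation.
        public void beUrgent() { delivery = new UrgentDelivery(); }

        public double deliveryCharge() { return delivery.getCharge(); }
    }

Note how the conceptual model's dynamic subtype (an order becoming an UrgentOrder) is implemented not by changing the object's class, which Java cannot do, but by swapping the strategy object it holds.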
The same applies to all UML notations. Table 1 shows how I use the main UML notations within each kind of model. A blank cell indicates that I don’t use that notation within that kind of model, but surely you’ll be able to find someone who does. So it’s vital that whenever you use one of these notations, you indicate clearly both what kind of model this is part of and precisely what you’re trying to depict. The UML would be much better if it had a built-in understanding of the three kinds of model and at least insisted that you state which one you are building. Unfortunately, it doesn’t, so unless you are lucky enough to be working with UML tools that support profiles, the best you can do is use UML’s stereotype feature to mark model elements appropriately.

Table 1. Using the main UML notations with conceptual, specification, and implementation models

Diagrams                  | Conceptual model           | Specification model            | Implementation model
Use case                  | —                          | Software boundary interactions | —
Class                     | Information models         | Object structures              | Object structures
Sequence or collaboration | —                          | Required object interactions   | Designed object interactions
Activity                  | Business processes         | —                              | —
Statechart                | Event-ordering constraints | Message-ordering constraints   | Event or response definitions

So, if a class diagram depicting an information model is, despite appearances, saying something fundamentally different from a class diagram specifying the chosen object structures in our software, doesn’t that just make life more difficult and confusing? No. By having these strong distinctions, we can model at different levels of abstraction at different times and separate the concerns that apply at the different levels. To get these benefits, though, modelers must be very clear in their minds about the different perspectives provided by the three kinds of models, and my experience is that even many experienced developers have yet to think clearly about all this. They need to get a sense of purpose.

John Daniels is a consultant at Syntropy Limited. Contact him at [email protected].
software construction
Editors: Andy Hunt and Dave Thomas ■ The Pragmatic Programmers ■ [email protected] ■ [email protected]

Ubiquitous Automation

Civilization advances by extending the number of important operations we can perform without thinking. —Alfred North Whitehead

Welcome to our new column on software construction. We hope that you’ll enjoy following this series as we explore the practical nuts-and-bolts issues of building today’s software. Before we start, though, we need to talk about the column’s title. In some ways, “construction” is an unfortunate term, in that the most immediate (and often used) example involves building construction, with the attendant idea of an initial and inviolate architecture from which springs forth design and code.

Of course, nothing could be further from the truth. Software development, including its construction, is utterly unlike any other human endeavor. (Software development is also exactly the same as all other human endeavors, but that’s a topic for another time.) Software is unique in both its malleability and its ephemerality; it is (to borrow the title of a Thomas Disch book) the dreams our stuff is made of. Yet we cannot simply wish a software system into being. We must create it using some semblance of engineering practice. This tension between the nonrepeatable, ill-defined, chaotic creative process and the scientific, repeatable, well-defined aspects of engineering is what causes so much heartburn in practitioners, authors, and scholars. Is software engineering? Is it art? Is it craft?

Many authors and pundits have compelling arguments for each of these viewpoints.1–3 This is understandable because all the views have merit—software development is all these things, and this plurality is what causes so much misunderstanding. When should we act like engineers, and when shouldn’t we? Many organizations still view coding as merely a mechanical activity. They take the position that design and architecture are of paramount importance and that coding should be a rigorous, repeatable, mechanistic process that can be relegated to inexpensive, inexperienced novices. We wish these organizations the best of luck. We’d like to think we know a better way.

The SWEBOK (Software Engineering Body of Knowledge—view an online draft at www.swebok.org), for instance, states unequivocally that coding is far from a mechanistic translation of good design into code, but instead is one of those messy, imprecise human activities that requires creativity, insight, and good communication skills. Good software is grown organically and evolves; it is not built slavishly and rigidly. Design and coding must be flexible. We should not unduly constrain it in a misguided attempt to turn coders into robots. However, the way we construct software should not be arbitrary. It must be perfectly consistent, reliable, and repeatable, time after time. And that’s the topic of this first column. —Andy Hunt and Dave Thomas

References
1. P. McBreen, Software Craftsmanship: The New Imperative, Addison-Wesley, Reading, Mass., 2001.
2. T. Bollinger, “The Interplay of Art and Science in Software,” Computer, vol. 30, no. 10, Oct. 1997, pp. 128, 125–126.
3. W. Humphrey, Managing the Software Process, Addison-Wesley, Reading, Mass., 1989.
If software construction is to involve engineering, the process must be consistent and repeatable. Without consistency, knowing how to build, test, or ship someone else’s software (or even your own software two years later) is hard, if not impossible. Without repeatability, how can you guarantee the results of a build or a test run—how do you know the 100,000 CDs you just burned contain the same software you think you just tested?

The simple answer is discipline: software construction must be disciplined if it is to succeed. But people don’t seem to come that way; in general, folks find discipline hard to take and even harder to maintain. Fortunately, there’s a pragmatic solution: automate everything you can, so that the development environment itself provides the disciplined consistency and repeatability needed to make the process run smoothly and reliably. This approach leaves developers free to work on the more creative (and fun) side of software construction, which makes the programmers happier. At the same time, it makes the accountants happier by creating a development organization that is more efficient and less prone to costly human-induced errors and omissions.

Compilation automation
If you do nothing else, make sure that every developer on a project compiles his or her software the same way, using the same tools, against the same set of dependencies. Make sure this compilation process is automated. This advice might seem obvious, but it’s surprising how often it’s ignored. We know teams where some developers compile using the automatic dependencies calculated by their integrated development environment, others use a second IDE, and still others use command-line build tools. The result? Hours wasted every week tracking down problems that aren’t really problems, as each developer tests against subtly different executables generated by his or her individual compilation system.

Nowadays, automated compilations are pretty straightforward. Most IDEs offer a single-key “compile the project” command. If you’re using Java from the command line, there’s the Ant tool (http://jakarta.apache.org/ant/index.html). In other environments, the “make” system, or less common variants such as Aegis (www.pcug.org.au/~millerp/aegis/aegis.html), do the same job. Whatever the tool, the ground rules are the same: provide a single command that works out what needs to be done, does it, and reports any errors encountered.

Testing automation
During construction, we use unit tests to try to find holes in our software. (There are many other, equally important, reasons for using unit tests, but that’s the subject of another article.) However, most developers we’ve seen skip unit testing or at best do it in an ad hoc way. The standard technique goes something like this:

1. Write a wad of code.
2. Get scared enough about some aspect of it to feel the need to try it.
3. Write some kind of driver that invokes the code just written. Add a few print statements to the code under test to verify it’s doing what you thought it should.
4. Run the test, eyeball the output, and then delete (or comment out) the prints.
5. Go back to Step 1.

Let us be clear. This is not unit testing. This is appeasing the gods. Why invest in building tests only to throw them away after you’ve run them once? And why rely on simply scanning the results when you can have the computer check them for you? Fortunately, easy-to-use automated testing frameworks are available for most common programming languages. A popular choice is the set of xUnit frameworks, based on the Gamma/Beck JUnit and SUnit systems (www.xprogramming.com/software.htm). We’ll look at these frameworks in later articles. For now, it’s enough to point out that they are simple to use, composable (so you can build suites of tests), and fully automated.
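To make the contrast with the print-statement ritual concrete, here is a minimal sketch of what an automated xUnit-style test looks like, using the JUnit framework the column mentions (JUnit 3-style API; the ShoppingCart class under test is invented for illustration and shown inline to keep the sketch self-contained):

    import junit.framework.TestCase;

    // Minimal class under test, invented for this example.
    class ShoppingCart {
        private double total = 0.0;
        public void add(String item, double price) { total += price; }
        public double total() { return total; }
    }

    // Each test method sets up a fixture, exercises the code under test, and lets
    // the framework check the result: no print statements, no eyeballing output.
    public class ShoppingCartTest extends TestCase {

        public void testEmptyCartHasZeroTotal() {
            ShoppingCart cart = new ShoppingCart();
            assertEquals(0.0, cart.total(), 0.001);
        }

        public void testTotalSumsItemPrices() {
            ShoppingCart cart = new ShoppingCart();
            cart.add("apple", 0.40);
            cart.add("bread", 1.20);
            assertEquals(1.60, cart.total(), 0.001);
        }
    }

Because the checks are in code, the whole suite can run unattended after every compile, and a failure names the exact test that broke, which fits naturally into the single-command build described earlier.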
Remember when we talked about making the build environment consistent? Well, the same applies to testing: you need repeatable test environments. However, that’s not to say they should all be the same. Test in a wide variety of situations—some bizarre, some commonplace. Expose your code to the true range of target environments. Do it as early as possible during development. And do it as automatically as you can.

Once you can compile code with a button press, and test it soup-to-nuts, what’s next? Integration, reviews, release, perhaps—all fertile ground for the seeds of automation. And then there’s the lowly shipping process. Here automation is especially crucial. If you can’t reliably go from source in the repository to code on a CD, how do you know what you’re shipping? Every delivery is a stressful situation, where each manual step must be completed correctly if the four-billion-odd bits on the CD are to be the correct ones. Too often, though, transferring and replicating completed software is a heavily manual process, one that allows the introduction of all sorts of needless risks. What this part of the whole process needs is an automated delivery build.

What is an automated delivery build? At a minimum, it involves clean-room compilation, automated running of the unit tests, automated running of whatever functional tests can be automated, and automated collection of the deliverable components (not forgetting documentation) into some staging area. (Speaking of documentation, how much do you currently produce automatically? How much could you? After all, the more documentation you automate, the more likely it will be up-to-date.) Once you’ve generated your shippable software, take it to the next level, and automate the testing of the result. Can the pile of bits you’re proposing to deliver be installed in typical customer environments? Does it work when it gets there? Think about how much of this process you can automate, allowing the quality assurance staff to concentrate on the hard stuff.

Never underestimate the effort required to do all this. The cycle times of testing the process are large. Every small problem means starting again, compiling and testing and so on. However, there’s a trick: don’t leave build automation until the end. Instead, make it part of the very first project iteration. At this stage it will be simple (if incomplete). Then arrange for it to run at least daily. This has two advantages. First, the build’s products give your quality assurance folks something to work with: anyone can take a build snapshot at any time. Second, you get to find out that the delivery build process failed the day it failed. The fixes will be immediate and incremental.
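In practice a delivery build like this is usually driven by a makefile, an Ant target, or a shell script. Purely to keep this column's examples in one language, here is a minimal Java sketch of the idea: a single command that either completes every step or stops and reports the failing one. The step commands are placeholders for whatever your project actually uses.

    import java.util.Arrays;
    import java.util.List;

    // A toy delivery-build driver: run each step in order, stop at the first failure.
    // The commands below are placeholders; substitute your project's real build,
    // test, and packaging commands.
    public class DeliveryBuild {
        private static final List<List<String>> STEPS = Arrays.asList(
            Arrays.asList("ant", "clean", "compile"),     // clean-room compilation
            Arrays.asList("ant", "unit-tests"),           // automated unit tests
            Arrays.asList("ant", "functional-tests"),     // automated functional tests
            Arrays.asList("ant", "collect-deliverables")  // copy artifacts to the staging area
        );

        public static void main(String[] args) throws Exception {
            for (List<String> step : STEPS) {
                System.out.println(">>> " + String.join(" ", step));
                Process p = new ProcessBuilder(step).inheritIO().start();
                if (p.waitFor() != 0) {
                    System.err.println("Delivery build failed at: " + step);
                    System.exit(1);
                }
            }
            System.out.println("Delivery build completed; staging area is ready to ship.");
        }
    }

Running it daily, as suggested above, means a broken step is noticed the day it breaks rather than the week before shipping.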
Other uses of automation
We’ve barely scratched this topic’s surface. Consider that you needn’t restrict automation to the build or even to just development. Perhaps your project has strict reporting requirements, review and approval processes, or other onerous secretarial chores. Where possible, try to automate these bits of drudgery as well. There’s more to constructing software than just constructing software, and we can leverage automation throughout the process.

The cobbler’s children
As we said in The Pragmatic Programmer (Addison-Wesley, 2000), the cobbler’s children have no shoes. Often, people who develop software use the poorest tools to do the job. You can do better than that. Learn how to automate your environment—write macros if your IDE supports it; grab your favorite scripting language if it doesn’t (Ruby is a good choice; see www.ruby-lang.org). Get your computer to do as much of the repetitive and mundane work as you can. Then you can move on to the really hard stuff. See you next issue.

Andy Hunt and Dave Thomas are partners in The Pragmatic Programmers, LLC. They feel that software consultants who can’t program shouldn’t be consulting, so they keep current by developing complex software systems for their clients. They also offer training in modern development techniques to programmers and their management. They are coauthors of The Pragmatic Programmer and Programming Ruby, both from Addison-Wesley, and speak on development practices at conferences around the world. Contact them via www.pragmaticprogrammer.com.
focus
guest editors’ introduction

Building Software Securely from the Ground Up
Anup K. Ghosh, Cigital
Chuck Howell, MITRE
James A. Whittaker, Florida Institute of Technology

As software professionals first and computer security researchers second, we believe that most computer security articles published in industry magazines focus on fixing security problems after the fact. Indeed, many of these “fixes” tend to be point solutions that address symptoms of a much larger problem and fail to address the root causes of the real computer security problem. Our assessment of the state of the industry motivated us to pull together a special theme issue for IEEE Software focusing on how to build software systems securely from the ground up. Of course, we’re not the only people who believe that systems must be built this way. Richard Clarke, the US President’s Special Advisor for Cyberspace Security, admonished our industry at the Business Software Alliance’s Global Technology Summit in Washington, D.C., on 4 December 2001:

To start, members of the IT industry must build information security into their products at the point of development and not treat it as an afterthought.

Break the cycle
The types of approaches and point solutions advocated by computer security professionals to date have aimed at system administrators, chief information officers, and other personnel involved in system infrastructure management. These solutions usually focus on addressing an enterprise’s security defenses (such as firewalls, routers, server configuration, passwords, and encryption) rather than on one of the key underlying causes of security problems—bad software. Although point solutions are often necessary, they are inadequate for securing Internet-based systems. Point solutions are temporary barriers erected to stem the tide of attacks from unsophisticated attackers. In a sense, point solutions are simply bandages on a broken machine; they might stop some bleeding, but the wound underneath still festers. Fundamentally, we believe the problem must be fixed at its core by building secure, robust, survivable software systems.

For this issue, we solicited articles from software researchers and professionals to provide guidance and innovation for building software systems to be resistant to attack. We start from the premise that most security breaches in practice are made possible by software flaws. Statistics from the Software Engineering Institute’s CERT Coordination Center on computer security incident reports support this assertion (see the “Further Reading” sidebar). But engineering secure and robust software systems can break the penetrate-and-patch cycle of software releases we have today (see the “Penetrate and Patch Is Bad” sidebar).

The key goal of this issue is to encourage a deeper understanding of how security concerns should influence all aspects of software design, implementation, and testing. A notorious example of poor software implementation is the buffer overflow vulnerability. Known for decades, and very troublesome in networked systems, it continues to be introduced into new software at an alarming rate, due in part to software development habits that trace back to isolated systems where such flaws had few security implications. Software designers in a networked world cannot pretend to work in isolation. People are a critical part of the full software security equation, and software that makes unrealistic or unreasonable security-related demands on users (for example, requiring them to memorize too many passwords that change too often) is software whose security will inevitably be breached. Finally, although individuals exploiting flawed software cause most security breaches today, a more ominous emerging threat is malicious code that is deliberately written to exploit software flaws. Examples are the Code Red and Nimda worms and their variants, which can spread on an Internet scale in Internet time. To effectively address current and future problems in computer security, we must build software securely from the ground up.
Penetrate and Patch Is Bad
Gary McGraw

Many well-known software vendors don’t understand that security is not an add-on feature. They continue to design and create products at alarming rates, with little attention paid to security. They start to worry about security only after someone publicly (and often spectacularly) breaks their products. They then rush out a patch instead of realizing that adding in security from the start might be a better idea. This sort of approach won’t do in e-commerce or other business-critical applications. We should strive to minimize the unfortunately pervasive penetrate-and-patch approach to security and avoid desperately trying to come up with a fix to a problem that attackers are actively exploiting. In simple economic terms, finding and removing bugs in a software system before its release is orders-of-magnitude cheaper and more effective than trying to fix systems after release.1

The penetrate-and-patch approach to security presents many problems:

■ Developers can only patch problems they know about. Attackers usually don’t report the problems they find to developers.
■ Vendors rush out patches as a result of market pressures and often introduce new problems into the system.
■ Patches often only fix a problem’s symptoms; they do nothing to address the underlying cause.
■ Patches often go unapplied, because system administrators tend to be overworked and often do not wish to make changes to a system that “works.” In many cases, system administrators are generally not security professionals.

Designing a system for security, carefully implementing the system, and testing the system extensively before release presents a much better alternative. The fact that the existing penetrate-and-patch approach is so poorly implemented is yet another reason why we must change it. In the December 2000 Computer article “Windows of Vulnerability: A Case Study Analysis,” Bill Arbaugh, Bill Fithen, and John McHugh discuss a life-cycle model for system vulnerabilities that emphasizes how big the problem is.2 Data from their study shows that intrusions increase once a vulnerability is discovered, the rate continues to increase until the vendor releases a patch, and exploits continue to occur even after the patch is issued (sometimes years after). Figure A is based on their data. It takes a long time before most people upgrade to patched versions because most people upgrade for newer functionality or the hope of more robust software or better performance, not because they know of a real vulnerability.

Figure A. An average curve of the number of intrusions by a security bug over time (intrusions plotted against time, with the vulnerability’s disclosure, the patch release, and the scripting of the exploit marked along the curve).

References
1. F. Brooks Jr., The Mythical Man-Month: Essays on Software Engineering, 2nd ed., Addison-Wesley, Reading, Mass., 1995.
2. B. Arbaugh, B. Fithen, and J. McHugh, “Windows of Vulnerability: A Case Study Analysis,” Computer, vol. 33, no. 12, Dec. 2000, pp. 52–59.

Gary McGraw is Cigital’s chief technology officer and a noted authority on mobile-code security. His research interests include software security and software risk management. He received a BA in philosophy from the University of Virginia and a PhD in cognitive science and computer science from Indiana University. Contact him at [email protected].
The articles
We are fortunate to have received numerous outstanding article contributions for this issue—clearly speaking to the importance of this problem. We chose articles to cover a spectrum of software security issues—two case studies on principled methods for building secure systems, development of advanced techniques for creating trust to facilitate software component integration, and source code analysis for software vulnerability detection.

“Correctness by Construction: Developing a Commercial Secure System” by Anthony Hall and Roderick Chapman provides an interesting case study in which good software engineering practices and formal methods are used to demonstrate that we can build practical, secure systems from insecure commercial off-the-shelf components. On a similar theme, “EROS: A Principle-Driven Operating System from the Ground Up” by Jonathan Shapiro and Norm Hardy offers an experience report about building a capabilities-based operating system from the ground up to be verifiably secure. In “Composing Security-Aware Software,” Khaled Md Khan and Jun Han describe their component security characterization framework for composing trust in systems by exposing the components’ security properties through active interfaces. Finally, “Improving Security Using Extensible Lightweight Static Analysis” by David Evans and David Larochelle describes a lightweight static analysis tool for software developers to eliminate vulnerabilities in source code prior to releasing software.

We hope you find these articles as enlightening on building secure software systems as we found them enjoyable. Please see the “Further Reading” sidebar for additional resources on this important topic.

Further Reading

Books:
Building Secure Software: How to Avoid Security Problems the Right Way, by John Viega and Gary E. McGraw, Professional Computing Series, Addison-Wesley, ISBN 0-201-72152-X, Reading, Mass., 2001.
Computer Related Risks, by Peter G. Neumann, Addison-Wesley, ISBN 0-201-55805-X, Reading, Mass., 1995.
Security Engineering: A Guide to Building Dependable Distributed Systems, by Ross J. Anderson, John Wiley & Sons, ISBN 0-471-13892-6, New York, 2001.
Security and Privacy for E-Business, by Anup K. Ghosh, John Wiley & Sons, ISBN 0-471-38421-6, New York, 2001.

Web resources:
CERT Coordination Center: www.cert.org
Common Vulnerabilities and Exposures: http://cve.mitre.org
BugTraq: www.securityfocus.com/archive/1

About the Authors

Anup K. Ghosh is vice president of research at Cigital and an expert in e-commerce security. He is the author of E-Commerce Security: Weak Links, Best Defenses (John Wiley & Sons, 1998) and Security and Privacy for E-Business (John Wiley & Sons, 2001). His interests include intrusion detection, mobile-code security, software security certification, malicious software detection/tolerance, assessing the robustness of Win32 COTS software, and software vulnerability analysis. He has served as principal investigator on grants from the US National Security Agency, DARPA, the Air Force Research Laboratory, and NIST’s Advanced Technology Program. Contact him at [email protected].

Chuck Howell is consulting engineer for software assurance in the Information Systems and Technology Division at the MITRE Corporation. The Division focuses on exploring, evaluating, and applying advanced information technologies in critical systems for a wide range of organizations. He is the coauthor of Solid Software (Prentice Hall, 2001). Contact him at [email protected].

James A. Whittaker is an associate professor of computer science and director of the Center for Software Engineering Research at the Florida Institute of Technology. His interests involve software development and testing; his security work focuses on testing techniques for assessing software security and the construction of tools to protect against any manner of malicious code. Contact him at [email protected].
focus
building software securely

Correctness by Construction: Developing a Commercial Secure System
Anthony Hall and Roderick Chapman, Praxis Critical Systems

Praxis Critical Systems recently developed a secure Certification Authority for smart cards. The CA had to satisfy demanding performance and usability requirements while meeting stringent security constraints. The authors show how you can use techniques such as formal specification and static analysis in a realistic commercial development.

When you buy a car, you expect it to work properly. You expect the manufacturer to build the car so that it’s safe to use, travels at the advertised speed, and can be controlled by anyone with normal driving experience. When you buy a piece of software, you would like to have the same expectation that it will behave as advertised. Unfortunately, conventional software construction methods do not provide this sort of confidence: software often behaves in completely unexpected ways. If the software in question is security- or safety-critical, this uncertainty is unacceptable. We must build software that is correct by construction, not software whose behavior is uncertain until after delivery.

Correctness by construction is possible and practical. It demands a development process that builds correctness into every step. It demands rigorous requirements definition, precise system-behavior specification, solid and verifiable design, and code whose behavior is precisely understood. It demands defect removal and prevention at every step. What it does not demand is massive spending beyond the bounds of commercial sense. On the contrary, the attention to correctness at early stages pays off in reduced rework costs. When we build software in this way, we give a warranty that it will behave as specified, and we don’t lose any sleep over the cost of honoring this warranty.

This article describes how we applied this philosophy to the development of a commercial secure system. The system had to meet normal commercial requirements for throughput, usability, and cost as well as stringent security requirements. We used a systematic process from requirements elicitation through formal specification, user interface prototyping, rigorous design, and coding in Spark, to ensure these objectives’ achievement. System validation included tool-assisted checking of a formal process design, top-down testing, system testing with coverage analysis, and static code analysis. The system uses commercial off-the-shelf hardware and software but places no reliance on COTS correctness for critical security properties. We show how a process that achieves normal commercial productivity can deliver a highly reliable system that meets all its throughput and usability goals.
List of Abbreviations
CA      Certification Authority
COTS    commercial off-the-shelf
FSPM    Formal Security Policy Model
FTLS    formal top-level specification
HLD     high-level design
ITSEC   Information Technology Security Evaluation Criteria
UIS     user interface specification
UR      user requirements

Background
Praxis Critical Systems recently developed the Certification Authority for the Multos smart card scheme on behalf of Mondex International (MXI).1 (See the “List of Abbreviations” and “Useful URLs” sidebars for more information.) The CA produces the necessary information to enable cards and signs the certificates that permit application loading and deletion from Multos cards. Obviously, such a system has stringent security constraints. It must simultaneously satisfy commercial requirements for high throughput and good usability by its operators. The combination of security and throughput requirements dictated a distributed system with several processors. Furthermore, to meet the development budget and timescale, we could not build the system from scratch, requiring use of COTS hardware and infrastructure software. MXI was keen for a predictable development process, with no surprises and minimum risk. They also wanted to develop according to the UK Information Technology Security Evaluation Criteria,2 one of the forerunners of the Common Criteria.3 The CA supports smart cards that are certified to the highest ITSEC level, E6. This requires a stringent development process including the use of formal methods at early stages. Previous experience had shown that, when properly applied, the E6 process forced the customer and supplier to explicitly and unambiguously understand system requirements, which avoided unpleasant surprises in late testing. We therefore developed the CA to the standards of E6.

The development approach
Correctness by construction depends on knowing what the system needs to do and being sure that it does it. The first step, therefore, was to develop a clear requirements statement. However, developing code reliably from requirements is impossible: the semantic gap is too wide. So, we used a sequence of intermediate system descriptions to progress in tractable, verifiable steps from the user-oriented requirements to the system-oriented code. At each step, we typically had several different descriptions of various system aspects. We ensured that these descriptions were consistent with each other, and we ensured that they were correct with respect to earlier descriptions. At each stage, we used descriptions that were as formal as possible. This had two benefits. Formal descriptions are more precise than informal ones, which forced us to understand issues and questions before we actually got to the code. Additionally, more powerful verification methods for formal descriptions exist than methods for informal ones, so we had more confidence in each step’s correctness. Figure 1 shows the overall set of deliverables from the development process, grouped into the main process steps.
Figure 1. Development deliverables grouped into the main process steps. Requirements: user requirements, formal security policy model. Specification and architecture: formal top-level specification, user interface specification, high-level design. Design: database design, process design, UI design, data dictionary, module structure, build specification, and supplementary design documents. Code: package specifications, process and task code, database code, package bodies, and window class code.
Useful URLs
Spark: www.sparkada.com
Mondex: www.mondex.com
Multos: www.multos.com
ITSEC: www.cesg.gov.uk
Common Criteria: www.commoncriteria.org
Formal methods and Z: www.afm.sbu.ac.uk

Requirements
Before we thought about the CA system,
we had to understand the business environment in which it would operate. We used our well-tried requirements-engineering method, Reveal,4 to define the CA’s environment and business objectives and to translate these into requirements for the system. We wrote the user requirements (UR) document in English with context diagrams,5 class diagrams, and structured operation definitions,6 providing a first step toward formalizing the description. We labeled every requirement so that we could trace it, and we traced every requirement to its source. We traced each security requirement to the corresponding threats. We carried out this tracing through all the development documents down to the code and tests. We validated user requirements through client review and manual verification for consistency between different notations. The highest levels of ITSEC and the Common Criteria require a Formal Security Policy Model (FSPM). The user requirements included an informal security policy that identified assets, threats, and countermeasures. This contained 28 technical items. Of these, we formalized the 23 items related to viewing the system as a black box rather than the five that dealt with implementation details.

Specification and architecture This phase covered two activities, carried out in parallel:
■ detailed system behavior specification and
■ high-level design (HLD).
The system specification comprises two closely related documents: the user interface specification and the formal top-level specification. Together, these completely define the CA’s black-box behavior. The UIS defines the system’s look and feel. We developed it by building a user interface prototype, which we validated with the CA operational staff. The FTLS defines the functionality behind
the user interface. We derived it from the functions identified in the UR, the constraints in the FSPM, and the user interface prototype’s results. We used a typechecker to verify that the FTLS was well formed, and we validated it by checking against the user requirements and FSPM and by client review. The HLD contained descriptions of the system’s internal structure and explanations of how the components worked together. Several different descriptions looked at the structure in varying ways, such as in terms of
■ distribution of functionality over machines and processes,
■ database structure and protection mechanisms, and
■ mechanisms for transactions and communications.
The HLD aimed mainly to ensure satisfaction of security and throughput requirements. One of the most important parts was achieving security using inherently insecure COTS components, such as a commercial database. We did this on the basis of our experience using COTS in safety-critical systems. Specifically, we did not rely on COTS for confidentiality or integrity. We achieved these by
■ hardware separation;
■ confidential information encryption;
■ message authentication codes, where data integrity was needed; and
■ individual processing of security-critical functions to avoid reliance on separation between processes.
Detailed design The detailed design defined the set of software modules and processes and allocated the functionality across them. It also, when necessary, provided more detail for particular modules. We deliberately did not write a detailed design of every system aspect. Often, the FTLS and module structure gave enough information to create software directly. However, a few cases existed, where the implementer required much more information than the FTLS had. For example, we used Z7—a mathematical language supported by English descriptions—to specify the module that manages cryptographic keys and their
verification on system startup. This specification helped us resolve many difficult issues before coding. We used different notations for the design documents, according to their subject matter, always following the rule of maximum practical formality. The most novel part was the process design. Code Coding the CA posed some interesting challenges. For example, the CA’s expected life span is some 20 years. With this in mind, we viewed development technologies that were particularly fashionable at the time with some skepticism. Additionally, although most development favors the reuse of COTS components, we tried to avoid these in the CA’s development as far as was practical. For instance, we implemented a remote procedure call mechanism entirely from scratch, rather than relying on commercial middleware such as DCOM or an implementation of Corba. We aimed for five nines availability (that is, 99.999 percent) in security-critical parts. Housed in a tamperproof environment, the system cannot be rebooted without some effort, so we spent considerable effort in ensuring the system’s stability. We estimated actual maintenance of the system’s tamperproof parts (for example, installation of new software builds) to be possible only once every six months. Other challenges included the client’s requirement of a commercial operating system (in this case, Windows NT 4) and the ITSEC (and more recently the Common Criteria) requirements for languages that “unambiguously define the meaning of all statements” used in a system’s implementation. We soon realized that no single implementation language could do the job. Our experience with safety-critical system development suggested that implementation in Ada95 would suit the system’s critical parts. However, Ada95 was clearly not appropriate at the time for the GUI’s development. Ultimately, we settled on a “right tools for the job” mix of languages. So, we implemented the system’s security-enforcing kernel in Spark Ada8—an annotated subset of Ada95 widely used in safety-critical systems—whose properties make it suitable for secure system development. (See the “Spark
and the Development of Secure Systems” sidebar for more information.) We implemented the system’s infrastructure (for example, remote procedure call mechanisms and concurrency) in Ada95. The system’s architecture carefully avoids any securityrelated functionality in the GUI, so we implemented this in C++, using Microsoft’s Foundation Classes. We used some small parts, such as device drivers for cryptographic hardware, and one standard cryptographic algorithm as is. We reviewed the C source code for these units by hand. We identified some of the system’s technical aspects early on as carrying some risk. These included the implementation of a concurrent Ada program as a Win32 service, use of the Win32 and Open Database Connectivity APIs from Ada, and linking the C++ GUI with the underlying Ada application software. We attacked these risks using “trailblazing” activities such as implementing small demonstrator applications. For all system parts, we enforced rigorous coding standards. We reviewed all the code against these standards and relevant source documents, such as the FTLS and UIS. We also used automatic static-analysis tools where possible, such as the Spark Examiner for Ada, and BoundsChecker and PC-Lint for C++.
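As a purely illustrative sketch of what such annotated code looks like (the package, names, and contract below are invented for this article, not taken from the CA), a Spark package specification states its data- and information-flow contract in special comments that the Examiner checks against the body:

-- Invented example, not CA code: a Spark 95-style package specification.
-- The "--#" annotations are ordinary comments to an Ada compiler, but the
-- Spark Examiner reads them and rejects any body whose actual information
-- flow does not match the stated contract.
package Audit_Log
--# own Log_State;                    -- abstract name for the package state
is
   procedure Record_Event (Event_Code : in     Natural;
                           Accepted   :    out Boolean);
   --# global in out Log_State;
   --# derives Log_State from Log_State, Event_Code &
   --#         Accepted  from Event_Code;
   -- Contract: the new log state depends only on the old state and the
   -- event code, and Accepted depends only on the event code.
end Audit_Log;

A body that, say, left Accepted unassigned on some path, or made Accepted depend on Log_State, would be reported by the Examiner before any human review.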
Verification and validation Correctness by construction does not claim zero defects: we do not believe that this is achievable in any engineering artifact. We do, however, have zero tolerance of defects. We try to find them as early as possible, and then we eliminate them. Furthermore, we collect data on defect introduction and removal and try to improve the process to reduce introduction of defects and to speed defect discovery. The first line of attack is review. We review all deliverables to check
■ correctness with respect to earlier deliverables,
■ conformance to standards, and
■ internal consistency.
Wherever possible, we carry out automated verification and validation on deliverables. As you’ll see in the next section, we were able to do some automated checks on
Spark and the Development of Secure Systems
The Spade Ada Kernel (Spark) is a language designed for constructing high-integrity systems. The language’s executable part is a subset of Ada95, but the language requires additional annotations to let it carry out data- and information-flow analysis1 and to prove properties of code, such as partial correctness and freedom from exceptions. These are Spark’s design goals:
■ Logical soundness. The language should have no ambiguities.
■ Simplicity of formal description. It should be possible to describe the whole language in a relatively simple way.
■ Expressive power. The language should be rich enough to construct real systems.
■ Security. It should be possible to determine statically whether a program conforms to the language rules.
■ Verifiability. Formal verification should be theoretically possible and tractable for industrial-sized systems.

Spark annotations appear as comments (and so are ignored by a compiler) but are processed by the Examiner tool. These largely concern strengthening the “contract” between a unit’s specification and body (for instance, specifying the information flow between referenced and updated variables). The annotations also enable efficient checking of language rules, which is crucial for using the language in large, real-world applications.

Spark has its roots in the security community. Research in the 1970s into information flow in programs2 resulted in Spade Pascal and, eventually, Spark. Spark is widely used in safety-critical systems, but we believe it is also well suited for developing secure systems. Particularly, it offers programwide, complete data- and information-flow analysis. These analyses make it impossible for a Spark program to contain a dataflow error (for example, the use of an uninitialized variable), a common implementation error that can cause subtle (and possibly covert) security flaws.

You can also achieve proof of correctness of Spark programs, which lets you show that a program corresponds with some suitable formal specification. This allows formality in a system’s design and specification to be extended throughout its implementation. Proof of the absence of predefined exceptions (for such things as buffer overflows) offers strong static protection from a large class of security flaw. Such things are anathema to the safety-critical community yet remain a common form of attack against networked computer systems. Attempting such proofs also yields interesting results. A proof that doesn’t come out easily often indicates a bug, and the proof forces engineers to read, think about, and understand their programs in depth. Experience on other projects suggests that proof is a highly cost-effective verification technique.3

You can compile Spark without a supporting runtime library, which implies that you can deliver an application without a commercial off-the-shelf component. This might offer significant benefits at the highest assurance levels, where evaluation of such components remains problematic.

Spark is amenable to the static analysis of timing and memory usage. This problem is known to the real-time community, where analysis of worst-case execution time is often required. When developing secure systems, you might be able to use such technology to ensure that programs exhibit as little variation in timing behavior as possible, as a route to protect against timing-analysis attacks.

You can access more information about Spark at www.sparkada.com.

References
1. J-F. Bergeretti and B.A. Carré, “Information-Flow and Data-Flow Analysis of While Programs,” ACM Trans. Programming Languages and Systems, vol. 7, no. 1, Jan. 1985, pp. 37–61.
2. D.E. Denning and P.J. Denning, “Certification of Programs for Secure Information Flow,” Comm. ACM, vol. 20, no. 7, July 1977, pp. 504–513.
3. S. King et al., “Is Proof More Cost-Effective Than Testing?” IEEE Trans. Software Eng., vol. 26, no. 8, Aug. 2000, pp. 675–686.
formal specifications and designs, which helped to remove defects early in the process. Our main verification and validation method is, of course, testing. Traditionally, critical-systems testing is very expensive. A reason for this is that testing occurs several times: we test individual units, integrate them and test the integration, and then test the system as a whole. Our experience with previous safety-critical projects9 suggests that this approach is inefficient and particularly that unit testing is ineffective and expensive. Unit testing is ineffective because most errors are interface errors, not internal errors in units. It is expensive because we must build test harnesses to test units in isolation. We adopted a more efficient and effective
approach. We incrementally built the system from the top down. Each build was a real, if tiny, system, and we could exercise all its functions in a real system environment. This reduced the integration risk. We derived the tests directly from the system specification. We ran the tests using Rational’s Visual Test, so that all tests were completely automated. Furthermore, we instrumented the code using IPL’s AdaTest so that we measured the statement and branch coverage we were achieving by the system tests. We devised extra design-based test scenarios only where the system tests failed to cover parts of the code. Formal methods We used Z to express the FSPM. We
based our approach on the Communications-Electronics Security Group’s Manual “F”10 but simplified the method. The informal security policy contained four different kinds of clause, each giving rise to a different kind of predicate in the FSPM:
■ Two clauses constrained the system’s overall state (each became a state invariant in the formal model).
■ Eight clauses required the CA to perform some function (for example, authentication).
■ Sixteen clauses were constraints applicable to every operation (for example, that only authorized users could perform them).
■ One clause was an information separation clause.
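As a purely illustrative sketch of the first kind of clause (the given set, variable names, and invariant below are invented and are not taken from the actual FSPM), a constraint on the overall system state becomes a predicate in a Z state schema, typeset here in LaTeX on the assumption that the zed-csp package is available:

\documentclass{article}
\usepackage{zed-csp}  % assumed available; provides the zed and schema environments
\begin{document}

% Illustrative given set of cryptographic keys.
\begin{zed}
  [KEY]
\end{zed}

% Invented state schema: a key may be usable only if it has been backed up.
\begin{schema}{CAState}
  usable, backedUp : \power KEY
\where
  usable \subseteq backedUp
\end{schema}

\end{document}

Any operation schema that changed usable or backedUp would then have to be shown to preserve this invariant.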
Information separation is harder to express in Z than other properties, and other formal languages such as CSP can express it more directly. However, we found the other 24 clauses straightforward to express in Z, so Z proved a good choice of language overall. Because we wrote the FSPM in Z, we could check some aspects of its internal consistency using a typechecker. We reviewed it for correctness with respect to the informal security policy. We did not carry out any proofs of correctness; although, in other projects, we found these to be effective in finding errors.9 Formal top-level specification The FTLS is a fairly conventional Z specification. However, it contains some special features to allow checking against the FSPM. In conventional Z, one or two schemas express an operation. In the FTLS, we used numerous schemas to capture each operation’s different security-relevant aspects. We used separate schemas to define each operation’s inputs, displayed information, and outputs. This let us trace clearly to FSPM restrictions on what is displayed and how outputs are protected. We used separate schemas to define when an operation was available or valid. This let us distinguish both errors that are prevented by the user interface and those that are checked once we confirm the operation and thus cause error messages to be displayed. We
also modeled errors in detail, to satisfy the requirement of reporting all errors. Process design We modeled the process structure in the CSP language. We mapped sets of Z operations in the FTLS to CSP actions. We also introduced actions to represent interprocess communications. This CSP model let us check if the overall system was deadlockfree and if there was no concurrent processing of security-critical functions. These checks were carried out automatically, using Formal Systems Europe’s failuresdivergence refinement tool. This helped us find significant flaws in our first design and gave us much greater confidence in the final design. Furthermore, implementing the design using Ada95 tasks, rendezvous, and protected objects was straightforward. We devised rules for systematically translating CSP into code. This successfully yielded code that worked the first time, a rare experience among concurrent programmers.
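The translation rules themselves are not reproduced in this article, but their flavor can be suggested with a deliberately tiny, invented sketch: a CSP process that engages in a request event followed by an issue event maps naturally onto an Ada task whose entries are accepted in the same order, so the ordering constraints checked in the CSP model survive into the code.

-- Invented sketch, not CA code: one possible CSP-to-Ada mapping in which
-- each synchronous CSP event becomes an Ada rendezvous (entry/accept pair).
--
-- CSP (invented):  SIGNER = request?d -> issue!c -> SIGNER
with Ada.Text_IO;
procedure Csp_Sketch is
   task Signer is
      entry Request (Data : in Integer);        -- CSP event "request"
      entry Issue (Cert : out Integer);         -- CSP event "issue"
   end Signer;

   task body Signer is
      Pending : Integer := 0;
   begin
      loop
         accept Request (Data : in Integer) do  -- synchronize on "request"
            Pending := Data;
         end Request;
         accept Issue (Cert : out Integer) do   -- then synchronize on "issue"
            Cert := Pending + 1;                -- stand-in for real work
         end Issue;
      end loop;
   end Signer;

   Cert : Integer;
begin
   Signer.Request (41);
   Signer.Issue (Cert);
   Ada.Text_IO.Put_Line ("Issued:" & Integer'Image (Cert));
   abort Signer;                                -- end the demonstration task
end Csp_Sketch;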
Programming languages and static analysis Given the formality of the specification and design, we hoped to carry this through into the implementation. ITSEC and Common Criteria require the use of programming languages with “unambiguous meaning,” yet the use of formal implementation languages remains rare. Experience suggests that, despite sound protocols and cryptography, sloppy implementation remains a common source of failure in supposedly secure systems—the ubiquitous “buffer overflow” attack, for instance. A cure for sloppy implementation is a formal implementation language for which we can carry out static analysis—that is, analyzing program properties such as information flow without actually running the program. To enable static analysis to produce useful results, the language must be as precise as possible. In the presence of ambiguity, static analysis must make assumptions (for example, “The compiler evaluates expressions left-to-right”) that can render the results dubious. Alternatively, it might attempt to cover all possible outcomes of any ambiguity—this leads to an explosion in analysis time that makes the tool unusable.
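As a small, invented illustration of the kind of ambiguity meant here: Ada deliberately leaves the evaluation order of an operator’s operands unspecified, so a program whose result depends on that order has no single meaning for an analysis tool to reason about. Spark sidesteps the problem by forbidding functions with side effects.

-- Invented example, not CA code: legal Ada whose result is ambiguous.
with Ada.Text_IO;
procedure Order_Sketch is
   Count : Integer := 0;
   Total : Integer;

   -- A function with a side effect: allowed in Ada, rejected by Spark.
   function Bump (By : Integer) return Integer is
   begin
      Count := Count + By;
      return Count;
   end Bump;

begin
   -- The language does not say which operand of "+" is evaluated first,
   -- so Total is 23 under one evaluation order and 31 under the other.
   Total := Bump (1) + Bump (10) * 2;
   Ada.Text_IO.Put_Line (Integer'Image (Total));
end Order_Sketch;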
Figure 2. Life-cycle phases where defects were introduced and where they were detected and removed.
Table 1. Distribution of effort

Activity                              Effort (%)
Requirements                           2
Specification and architecture        25
Code                                  14
Test                                  34
Fault fixing                           6
Project management                    10
Training                               3
Design authority                       3
Development- and target-environment    3
Unfortunately, few languages meet this need. Trends in programming-language design have favored dynamic behavior (for example, late binding of function calls in object-oriented languages) and performance (for example, unchecked arithmetic and array access in C) over safety. These features are dramatically at odds with the needs for static analysis, and so they are inappropriate for constructing high-integrity systems. The safety-critical community, where the use of high-integrity subset languages such as Spark is the norm, has broadly accepted this. Borrowing from Ross Anderson’s well-known analogy,11 clearly, if you’re programming Satan’s computer, you should not use Satan’s programming language! The use of Spark in the CA In the CA, we used an information flow–centered software architecture. This maximizes cohesion and minimizes coupling between units. We carefully chose between Spark and Ada95 for each compilation unit, on the basis of the required separation between security-related functions in the system. Even though an Ada compiler
processes both these languages, it was worth regarding them as separate languages for the purposes of the design. All Spark code had to pass through the Spark Examiner with no unjustified warnings or errors before any other review or inspection activity. This let reviewers focus on important topics (such as “Does this code implement the FTLS?”) rather than worrying about more trivial matters such as dataflow errors or adherence with coding standards. We used only the most basic form of annotation and analysis the Examiner offered. We did not perform proof of partial correctness of the code. We did carry out some proofs of exception freedom. Informally, the defect rate in the Spark code was extremely low. Spark programs have an uncanny habit of simply running the first time. The most significant difficulties arose in the system’s more novel parts, especially in areas that we had not formally designed or specified, such as the manipulation of Win32 named pipes, the database interface, and the handling of machine failures. Results Overall, the development has been successful. The number of system faults is low compared with systems developed using less formal approaches.12 The delivered system satisfies its users, performs well, and is highly reliable. In the year since acceptance, during which the system was in productive use, we found four faults. Of course, we corrected them as part of our warranty. This rate, 0.04 defects per KLOC, is far better than the industry average for new developments.12 Figure 2 shows the life-cycle phases where defects were introduced and where they were detected and removed. For example, 23 errors were introduced at the specification phase and removed during developer test. A good development method aims to find errors as soon as possible after they are introduced, so the numbers on the right of Figure 2 should be as small as possible. The delivered system contained about 100,000 lines of code. Overall productivity on the development—taking into account all project activities, including requirements, testing, and management—was 28 lines of code per day. The distribution of effort shows clearly that fault fixing constituted a
relatively small part of the effort (see Table 1); this contrasts with many critical projects where fixing of late-discovered faults takes a large proportion of project resources.
Three significant conclusions that we can draw from this work concern the use of COTS for secure systems, the practicality of formal methods, and the choice of programming language. You can build a secure system using insecure components, including COTS. The system’s overall architecture must guarantee that insecure components cannot compromise the security requirements. This resembles the approach taken to safety-critical systems. Using formal methods, as required by the higher levels of ITSEC and the Common Criteria, is practical. Our experience in this and other projects shows that well-considered use of formal methods is beneficial. Of course, neither formal methods nor any other known method can completely eliminate defects. For example, we didn’t discover a few specification errors until user test. We can attribute these to incorrect formalization of the detailed requirements. Nevertheless, formal methods do reduce the number of late-discovered errors and, thus, the overall system cost. Similarly, Spark is certainly not a magic bullet, but it has a significant track record of success in the implementation of high-integrity systems. Spark, we believe, is unique in actually meeting the implementation requirements of the ITSEC and CC schemes. Spark’s support for strong static analysis and proof of program properties (for example, partial correctness or exception freedom) means that you can meet the CC requirements for formal development processes. The language subset’s simplicity and the data- and information-flow analysis offered by the Examiner make a large class of common errors simply impossible to express in Spark.
Acknowledgments We thank John Beric of Mondex International for his comments on an early draft of this article. The Spark programming language is not sponsored by or affiliated with SPARC International and is not based on the SPARC architecture.
References
1. M. Hendry, Smartcard Security and Applications, 2nd ed., Artech House, Norwood, Mass., 2001.
2. Provisional Harmonised Criteria, version 1.2, Information Technology Security Evaluation Criteria (ITSEC), Cheltenham, UK, June 1991.
3. ISO/IEC 15408:1999, Common Criteria for Information Technology Security Evaluation, version 2.1, Int’l Organization for Standardization, Geneva, 1999; www.commoncriteria.org (current Nov. 2001).
4. J. Hammond, R. Rawlings, and A. Hall, “Will It Work?” Proc. Fifth Int’l Symp. Requirements Eng. (RE 01), IEEE CS Press, Los Alamitos, Calif., 2001, pp. 102–109.
5. S. Robertson and J. Robertson, Mastering the Requirements Process, Addison-Wesley, Reading, Mass., 1999.
6. D. Coleman et al., Object-Oriented Development: The Fusion Method, Prentice-Hall, Upper Saddle River, N.J., 1994.
7. J.M. Spivey, The Z Notation: A Reference Manual, 2nd ed., Prentice-Hall, Upper Saddle River, N.J., 1992.
8. J. Barnes, High Integrity Ada: The SPARK Approach, Addison-Wesley, Reading, Mass., 1997.
9. S. King et al., “Is Proof More Cost-Effective Than Testing?” IEEE Trans. Software Eng., vol. 26, no. 8, Aug. 2000, pp. 675–686.
10. CESG Computer Security Manual “F”: A Formal Development Method for High Assurance Systems, Communications Electronics Security Group, Cheltenham, UK, 1995.
11. R.J. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, John Wiley & Sons, New York, 2001.
12. S.L. Pfleeger and L. Hatton, “Investigating the Influence of Formal Methods,” Computer, vol. 30, no. 2, Feb. 1997, pp. 33–43.
About the Authors Anthony Hall is a principal consultant with Praxis Critical Systems. He is a specialist in requirements and specification methods and the development of software-intensive systems. He has been a keynote speaker at the International Conference on Software Engineering, the IEEE Conference on Requirements Engineering, and other conferences. He has an MA and a DPhil from Oxford University. He is a fellow of the British Computer Society and a Chartered Engineer. Contact him at Praxis Critical Systems Ltd., 20 Manvers St., Bath BA1 1PX, UK; [email protected].
Roderick Chapman is a software engineer with Praxis Critical Systems, specializing in
the design and implementation of high-integrity real-time and embedded systems. He has also been involved with the development of the Spark language and its associated static-analysis tools. He received an MEng in computer systems and software engineering and a DPhil in computer science from the University of York. He is a member of the British Computer Society and is a Chartered Engineer. Contact him at Praxis Critical Systems Ltd., 20 Manvers St., Bath BA1 1PX, UK; [email protected].
focus
building software securely
EROS: A Principle-Driven Operating System from the Ground Up
Jonathan S. Shapiro, Johns Hopkins University
Norm Hardy, Agorics, Inc.
Design principles are highly advocated in software construction but are rarely systematically applied. The authors describe the principles on which they built an operating system from the ground up, and how those principles affected the design, application structure, and system security and testability.

The Extremely Reliable Operating System1 is a capability-based operating system designed to support the security and reliability needs of active systems. Users of active systems can introduce and run arbitrary code at any time, including code that is broken or even hostile. Active systems are shared platforms, so they must simultaneously support potentially adversarial users on a single machine at the same time.
Because active systems run user-supplied code, we cannot rely on boundary security to keep out hostile code. In the face of such code, EROS provides both security and performance guarantees (see www.eros-os.org for downloadable software). An application that executes hostile code (such as viruses) cannot harm other users or the system as a whole and cannot exploit a user’s authority so as to compromise other parts of the user’s environment. The EROS project started as a cleanroom reimplementation of KeyKOS,2 an operating system Norm Hardy and his colleagues created for the IBM System/370 (see www.cis.upenn.edu/˜KeyKOS for earlier documents from the KeyKOS system). The key contributions of EROS are formal verification of some of the architecture’s critical security properties and performance engineering. These security and performance capabilities come from two sources. First, the primary system architecture is
uncompromisingly principle-driven. Wherever a desired feature collided with a security principle, we consistently rejected the feature. The result is a small, internally consistent architecture whose behavior is well specified and lends itself to a careful and robust implementation. Second, the system’s lead architects had prior experience as processor architects. This helped us avoid certain kinds of abstraction that modern operating systems generally include and seek a design that maps directly onto the features that modern hardware implementations provide; very little performance is lost in translating abstractions. Figure 1 shows the core EROS design principles. Jeremy Saltzer and Michael Schroeder first enumerated many of these in connection with the Multics project,3 and we incorporated others based on our experiences from other projects (see the “Related Work” sidebar). There are no magic bullets in the principles we adopted for EROS. The system’s performance and design coherency results solely from finding better ways to adhere consistently to these principles at fine granularity without sacrificing performance.

Figure 1. Core EROS design principles.

Principles from the Multics Project
■ Economy of mechanism: Keep the design as simple as possible.
■ Fail-safe defaults: Base access decisions on permission rather than exclusion.
■ Complete mediation: Check every access for authority.
■ Open design: The design should not be secret. (In EROS, both design and implementation are public.)
■ Least privilege: Components should have no more authority than they require (and sometimes less).
■ Least common mechanism: Minimize the amount of shared instances in the system.

Commonly accepted principles
■ Separation of policy and mechanism: The kernel should implement the mechanism by which resource controls are enforced but should not define the policy under which those controls are exercised.
■ Least astonishment: The system’s behavior should match what is naively expected.
■ Complete accountability: All real resources held by an application must come from some accounted pool.
■ Safe restart: On restart, the system must either already have, or be able to rapidly establish, a consistent and secure execution state.
■ Reproducibility: Correct operations should produce identical results regardless of workload.

Principles specific to EROS
■ Credible policy: If a security policy cannot be implemented by correct application of the system’s protection mechanisms, do not claim to enforce it.
■ No kernel allocation: The kernel is explicitly prohibited from creating or destroying resources. It is free, however, to use main memory as a dynamic cache for these resources.
■ Atomicity of operations: All operations the kernel performs are atomic—either they execute to completion in bounded time, or they have no observable effect.
■ Relinquishable authority: If an application holds some authority, it should (in principle) be able to voluntarily reduce this authority.
■ Stateless kernel: The system’s security and execution state should logically reside in user-allocated storage. The kernel is free to cache this state.
■ Explicit authority designation: Every operation that uses authority should explicitly designate the source of the authority it is using.

We maintained strong adherence to design principles in the EROS/KeyKOS design for three reasons:
■ We wanted to know that the system worked and why it worked. Unless you can trace each piece of the system code back to a motivating principle or a necessary correctness constraint, achieving this is difficult. Traceability of this type is also required for high-assurance evaluation.
■ We expected that a clean design would lead to a high-performance implementation. Based on microbenchmarks, this expectation has been validated.1
■ We wanted to formally and rigorously verify some of the security mechanisms on which the system relies. A rigorous verification of the EROS confinement mechanism, which is a critical security component in the system, was recently completed.4

This article provides some examples of how these principles affected the EROS system design. We also describe the application structure that naturally emerged in the resulting system and how this affected the system’s security and testability.

The EROS kernel design The most direct impact of design principles in EROS is in the kernel’s structure and implementation. In several cases, our strict adherence to design principles led to unusual design outcomes, some of which we discuss here. (Except where made clear by context, references to the EROS system throughout the article refer interchangeably to both EROS and KeyKOS.)

Safe restart In secure systems, we must ensure that the system has restarted in a consistent and secure state. In most operating systems, there is an initial set of processes that the kernel specially creates. These processes perform consistency checks, reduce their authorities to their intended steady-state authority, and then initiate the rest of the programs in the system. This creates two problems:

1. The consistency checks are heuristic, which makes establishing their correctness difficult. The Unix fsck command, for example, must decide which files to throw away and which to keep without knowing how these files interrelate. Consequently, the state of the group and password files might not be consistent with each other.
Related Work Henry Levy1 and Ed Gehringer2 provide overviews of several capability systems. EROS borrows ideas directly from three prior capability systems. Like Hydra,3 EROS is an extensible capability system. Programs can implement new objects that protected capabilities invoke. Like CAL/TSS,4 EROS unifies processes with protection domains. EROS designers also took to heart most of the design lessons reported from the CAL/TSS project. The Cambridge CAP computer,5 while implemented in hardware, similarly used fine-grain capabilities for memory protection. It’s also the first example of a stateless kernel. EROS uses kernel-protected capabilities. An alternative approach, used by Amoeba,6 treats capabilities as data, using unguessably sparse allocation for protection. This approach does not support confinement, because it is impossible to determine which bits of the application represent data and which represent capabilities. Simple cryptographic or signature schemes share this problem. One solution is password capabilities as used in Monash7 and Mungi,8 which apply a system-defined XOR before accepting capabilities. A concern with this approach is that any operation simple enough to be efficient (such as XOR) is easily reverse-engineered. True cryptographic checks must be cached to avoid prohibitive computational cost.
References
1. H.M. Levy, Capability-Based Computer Systems, Digital Press, 1984.
2. E.F. Gehringer, Capability Architectures and Small Objects, UMI Research Press, Ann Arbor, Mich., 1982.
3. W.A. Wulf, R. Levin, and S.P. Harbison, HYDRA/C.mmp: An Experimental Computer System, McGraw Hill, New York, 1981.
4. B.W. Lampson and H.E. Sturgis, “Reflections on an Operating System Design,” Comm. ACM, vol. 19, no. 4, May 1976, pp. 251–265.
5. M.V. Wilkes and R.M. Needham, The Cambridge CAP Computer and its Operating System, Elsevier, North Holland, 1979.
6. A.S. Tannenbaum, S.J. Mullender, and R. van Renesse, “Using Sparse Capabilities in a Distributed Operating System,” Proc. 9th Int’l Symp. Distributed Computing Systems, IEEE Press, Piscataway, N.J., 1986, pp. 558–563.
7. M. Anderson, R. Pose, and C.S. Wallace, “A Password Capability System,” The Computer J., vol. 29, no. 1, 1986, pp. 1–8.
8. G. Heiser et al., “Mungi: A Distributed Single Address-Space Operating System,” Proc. 17th Australasian Computer Science Conf., ACM Press, New York, 1994, pp. 271–280.
Figure 2. An EROS address space.
2. The initial processes receive their authority by means that are outside the normal mechanisms of granting or transferring authority. The designers must make specialized arguments to demonstrate that the system appropriately manages and diminishes this authority. The complexity of these arguments is comparable to the complexity of the correctness arguments for the remainder of the system. EROS resolves both issues by using a transacted checkpointing system. The system periodically takes an efficient, asynchronous snapshot of the entire state of the machine, performs a consistency check on this state, and then writes it down as a single disk transaction. Because the system is transacted as a whole, no possibility of global inconsistency exists. On restart, the system simply reloads the last completed transaction. System installation consists of writing (by hand) an initial system image; the processes of this system image have no unusual authority. Stateless kernel EROS is a stateless kernel—the system’s execution state resides in user-allocated storage. The kernel achieves performance by caching this state. A caching design facilitates checkpointing and imposes a dependency tracking discipline on the kernel. To ensure that user-allocated storage always reveals correct values when examined, the kernel must be able to restore this state on demand. These dependencies provide a form of self-checking. The kernel can sometimes compare its cached state to the user state to detect whether the runtime kernel state has become inconsistent, preventing a bad state from transacting to disk. EROS does not publish a memory map abstraction, because this would violate the stateless kernel principle. Instead, EROS requires that the applications explicitly allocate all of the pieces that comprise the mapping structure. Figure 2 shows a small EROS address space. The application explicitly allocates (typically by a user-level fault handler) every node and page in this address space. The kernel builds the hardware-memory-mapping tables by traversing this structure and caching the results in the hardware-mapping tables.
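A purely illustrative data-structure sketch of the shape being described (this is invented code, not EROS source, and the type names are made up): a mapping tree built from nodes of 32 capability slots, where each slot is void, names another node, or names a page.

-- Invented sketch, not EROS code: the structure of an EROS-style mapping
-- tree.  The kernel described above walks such a tree and caches the
-- result in the hardware mapping tables.
procedure Address_Space_Sketch is
   type Page_Id is new Natural;

   type Node;                                   -- a node: 32 capability slots
   type Node_Ref is access Node;

   type Cap_Kind is (Void_Cap, Node_Cap, Page_Cap);
   type Capability (Kind : Cap_Kind := Void_Cap) is record
      case Kind is
         when Void_Cap => null;
         when Node_Cap => Child : Node_Ref;     -- names another node
         when Page_Cap => Page  : Page_Id;      -- names a page of storage
      end case;
   end record;

   type Slot_Index is range 0 .. 31;
   type Slot_Array is array (Slot_Index) of Capability;
   type Node is record
      Slots : Slot_Array;
   end record;

   Root : Node;   -- every slot starts out as a void capability
begin
   Root.Slots (0) := (Kind => Page_Cap, Page => 7);          -- map one page
   Root.Slots (1) := (Kind => Node_Cap, Child => new Node);  -- grow the tree
end Address_Space_Sketch;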
Complete mediation In EROS, resources include pages of memory, nodes (fixed-size arrays of capabilities), CPU time, network connections, and anything that is built out of these. Every individual page, node, or other resource is named by one or more capabilities. Each capability names an object that is implemented by the kernel or another process in a separate address space. Capabilities are the only means of invoking operations on objects, and the only operations that can be performed with a capability are the operations authorized by that capability. This means that every resource is mediated and fully encapsulated. In most cases, a client cannot distinguish between system objects and objects that the server software implements. We can thus view an EROS system as a single large space of protected objects. Table 1 illustrates some of the key differences between capability systems and current conventional systems.

Complete accountability Although many systems claim complete accountability as a goal, few actually implement it at the kernel level. Failures of kernel accountability commonly take two forms:
■ The kernel might fail to account for kernel metadata. Mapping metadata is particularly hard to account for, because there is no direct correlation between the number of pages mapped and the amount of required metadata on most hardware.
■ The kernel might account for synthesized resources rather than real resources. A process consists of two nodes. Because they are not a fundamental unit of storage, EROS does not maintain a separate quota category for processes.
In EROS, all space-consuming resources are in terms of two atomic units of storage—nodes and pages—and these are the units that are accounted for. Applications explicitly perform all object allocations, and user-level fault handlers handle page faults. This is because a new page might need to be allocated to service the page fault, and the kernel can’t know the resource pool from which the new page should come.
Table 1. Protection properties of capabilities

                   Conventional systems                                        Capability systems
Based on           (User, object) pair                                         Per-process capabilities
Lookup by          rights(Object, process.user)                                rights(process[cap ndx])
Authority grant    Program run by owning user can grant authority to anyone    Can transfer if an authorized path of communication exists
Name resolution    String lookup (via open)                                    Direct designation
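To make the “lookup by rights(process[cap ndx])” row concrete, here is a small invented sketch (not the EROS implementation): an invocation names a slot in the calling process’s own capability table, and the rights stored in that slot, rather than the caller’s identity, decide whether the operation may proceed.

-- Invented model, for illustration only: per-process capability lookup.
with Ada.Text_IO;
procedure Cap_Invoke_Sketch is
   type Object_Id is new Natural;
   type Rights is record
      Can_Read, Can_Write : Boolean;
   end record;
   type Capability is record
      Obj     : Object_Id;
      Allowed : Rights;
   end record;

   type Slot_Index is range 0 .. 15;
   type Cap_Table is array (Slot_Index) of Capability;
   type Process is record
      Caps : Cap_Table;
   end record;

   P : Process;

   -- The caller designates its authority explicitly: "write via my slot Ndx".
   function Write_Permitted (Caller : Process; Ndx : Slot_Index) return Boolean is
   begin
      return Caller.Caps (Ndx).Allowed.Can_Write;  -- rights(process[cap ndx])
   end Write_Permitted;

begin
   P.Caps (3) := (Obj => 42, Allowed => (Can_Read => True, Can_Write => False));
   Ada.Text_IO.Put_Line (Boolean'Image (Write_Permitted (P, 3)));  -- prints FALSE
end Cap_Invoke_Sketch;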
Explicit designation In EROS, we can trace every operation a program performs to some authorizing capability. If a line of code performs an operation that modifies an object, the capability to that object is explicitly identified in the procedure call arguments. Because this is true, there is never any ambiguity about how and why the operation was permitted, and it becomes much harder for hostile clients to entice services into misusing their authority. Even the right to execute instructions is not innate—an application that does not hold a schedule capability does not execute instructions (least authority).

Credible policy This principle might be restated as “bad security drives out good” and is best illustrated by example. A commonly desired security policy is, “Fred shouldn’t have access to this object.” Unfortunately, if program A has a capability letting it speak to program B, and A also has a capability to some resource R, then A is in a position to access R on behalf of B (that is, to act as a proxy). If two programs can communicate, they can collude. An Interface Definition Language (IDL) compiler can automatically generate the code to do so. The only way to really prevent Fred’s access is to isolate his programs completely from all other programs, which is generally not the policy that people want. Because of this, EROS does not attempt to prevent the transmission of capabilities over authorized channels. Security is not achieved by preventing this copy. EROS stops programs from colluding if there is no authorized communication path between them, but its goal is to ensure that such paths cannot arise. We have yet to identify a feasible security policy that cannot be implemented this way.

Figure 3. Components connected by capabilities. (The figure shows a word processor application container and its edit buffer view, a shell, an open/save-as tool, the user file space, and the window system, connected through trusted and non-TCB interfaces.)

Least astonishment For the most part, we can implement the principles shown in Figure 1 without conflict. One exception is the principle of least
astonishment, which is violated in the capability invocation specification. If a process specifies an undefined address as the destination of an incoming data string, the kernel will truncate the message rather than let the fault handler for that region run. The problem is that messages are unbuffered (as required by the stateless kernel principle), the fault handler is untrusted, and the process sending the message might be a shared service. A denial-of-service attack against the service can be constructed by providing a fault handler that never returns. The kernel therefore truncates the message rather than risk a denial of service. This is astonishing to such a degree that one conference publication has cited it as a design flaw. On examination, there is a fundamental collision of principles in this area, and there are only three possible resolutions: buffering, timeouts, or truncation. Buffering violates several design principles (stateless kernel, least common mechanism, complete accountability, and no kernel allocation), and timeouts preclude repeatability under heavy load. So, given that a wellintentioned application is always in a position to provide a valid receive region, truncation appears to be the least offensive strategy for preventing denial of service. Component-based applications We now turn our attention to the structure of EROS applications. It is now hopefully clear that the facilities the EROS kernel directly provides are relatively low-level. Application code implements most of the system functions—even trusted functions. For example, the EROS kernel directly provides pages of disk storage but not a file system. The file abstraction is built entirely at the application level (separation of mechanism and policy), and the file application simply stores the file content in an address space, growing the address space as necessary
to hold the entire file. The file application’s responsibility is to implement operations such as read and write that act on the file. The checkpoint mechanism provides stabilization. Because a distinct object implements each file, this implementation maintains the principle of least common mechanism. This design pattern—creating higherlevel functions by composing the underlying primitives of the operating system in reusable components—is the basic strategy for building EROS applications. A separate process implements each component instance, and the kernel provides a high-performance interprocess communication mechanism that enables these components to be efficiently composed. In fact, it is rare for EROS applications to manipulate kernelsupplied objects directly. Most applications reuse components that the system supplies or implement new components that provide a needed function in a structured way. This naturally leads programmers to apply the principle of least privilege in their application designs, because these components are designed to use only the capabilities they need. Application structure EROS applications are structured as protected, capability-connected components (see Figure 3). Each instance of a component runs with individually specified capabilities that define its authority. Capabilities are kernel protected, as are the objects they designate. The only operations that can be performed with a capability are the operations the object defines. Because of this combination of protection and mediation, an application that executes hostile code (such as a virus) cannot harm the system as a whole or other users and can’t exploit the user’s authority to compromise other parts of the user’s environment. Similarly, capabilities control access to resources, preventing hostile code from overconsuming resources and making the rest of the system unusable. In Figure 3, the word processor is factored into a container component and individual editing components. The container component has access to the user’s file system only through a trusted “open/save-as” dialog system, but the editing components have no access to the user’s file system. While the word processor has nontrusted
access to the window system, the open/save-as tool has access through a special, trusted interface. The window system decorates trusted components with different window decorations, letting users know that they are interacting with a component that has potentially sensitive authority. Testability and defense in depth Designing applications as compositions of small components simplifies testing. Due to complete mediation, each component can be invoked only through its intended interface. Because they tend to be small and well isolated, EROS components also tend to be easily tested. A well-written test suite can typically reproduce and test all the states that are actually reachable by client code. An IDL compiler commonly generates external interfaces to components, which largely eliminates the risk of buffer overrun attacks. Because each component has a well-defined, protected interface, it is often possible to deploy new component versions into the field and test them against real applications by running them side by side with the current working version and comparing the results. Mediated components and least privilege also make the propagation of viruses more difficult. Compromising any single component doesn’t really buy the attacker very much, because the component’s actions are restricted by the capabilities it can invoke. Assuming that an attacker does compromise some part of the system, he has no readily exploited communication path by which to expand the compromise. All his interactions with the rest of the system are constrained by the protocols defined at the capability boundaries. Unlike firewalls, which must operate at the network level with relatively little knowledge of the application state, the capability interfaces operate at the application level with narrowly defined componentspecific interfaces. This provides the system overall with a type of “defense in depth” that is difficult (perhaps impossible) to achieve in applications that are structured as a single, undifferentiated address space. Contrast this with current systems. Once something compromises a piece of an application, the entire application is compromised. As a result, the virus gains all the authority that the application holds—even if
the original application didn’t actually use that authority. A Unix-based email reader has the authority to overwrite any file that the user can overwrite. The reader doesn’t do this because the program is well-behaved, but when a virus takes over the email reader, it can run any code that it wishes, usually with the full authority of the user running the application. In a capability system, this is not true. Constructors Closely related to the EROS component model is a generic system utility component called the constructor. When a developer writes code for a new component, she needs some mechanism to instantiate new copies of this component. This is the constructor’s job. There is a distinct constructor for each type of object in the system. To instantiate an object, the client invokes a capability to its constructor. The constructor’s second, more important task is to prevent information leakage. One of the key questions that a programmer would like to ask about a component is, “If I were to create one of these components and give it some vital piece of information, could the information be disclosed without my permission?” The constructor can determine whether the component it creates is “leak free.” This is possible because all of a component’s possible actions are determined by the capabilities that the component initially holds. If these initial capabilities are (transitively) read-only, then the only way the component can communicate is by using capabilities supplied by the program creating the component. Such a component is said to be confined.5 As long as the creator is selective in giving capabilities to the new component, information cannot leak. Because the constructor creates the component in the first place, it is in a position to know all the capabilities that the component holds and therefore can certify the component’s safety. In spite of its security features, the constructor creates new processes very quickly. In practice, we find that programmers use constructors as the generic program instantiation tool for all programs (whether or not they are confined). Surprisingly, the “all capabilities must be transitively read-only” restriction is almost
always enough. To date, the only applications we have seen that can’t be straightforwardly built under this restriction are things like networking subsystems. The network subsystem needs access to external devices, and because of this, it is necessarily a potential source of information leakage. The whole point of a network, after all, is to leak information. Leaky programs aren’t inherently bad, but they must be carefully examined. The constructor is therefore the key to safely executing untrusted code. If untrusted code is executed within a confinement boundary, it can’t communicate with the rest of the system at large. Although resource attacks (on the CPU or space, for example) are possible, we can restrict both the CPU time and space allocated to a confined subsystem. This means, for example, that a Web browser might be designed to instantiate a new HTML rendering component for each page. Within this component, it is perfectly safe to run untrusted scripting code, because the component as a whole is confined. The scripting code therefore does not have access to anything sensitive. Costs and benefits The use of capabilities and transparent persistence distinguish EROS from most other operating systems. Although component-based designs are well accepted, they require restructuring the application. Protection carries an associated performance overhead, so it is reasonable to ask what this design approach costs. Adapting applications Even if compatibility environments for existing applications can be constructed (a binary compatible Linux environment is in progress), EROS imposes a significant cost in development effort. To gain advantage from the underlying kernel’s security properties, we must refactor critical applications into components. The most critical of these applications are external interfaces, such as SMTP, LDAP, HTTP, and FTP servers. These services run with a great deal of authority and use security-critical code. When completed, the EROS system will ship with all these services. After servers, the next most important category is applications that execute active content such as scripting languages: browsers,
email agents, and word processors. In current software, the refactoring points in these applications often already exist and are easily modified. Word processors, for example, typically open files for writing only after putting up some form of a file dialog box. Modifying the dialog box mechanism to be a protected subsystem and return an open descriptor rather than a string would go a long way toward eliminating macro viruses. Comparable protection can’t be achieved by access control lists—in an access control list system, the application runs with the same authority as the user. Trusting the user interface The preceding discussion glosses over an important point. As a user, how do I know that I am talking to the real file dialog box? This is a trusted user interface design issue, and although work has been done on this, it isn’t a simple problem. A capability-based design helps, because, for example, the window system can implement distinguished trusted and untrusted interfaces (see Figure 3) and decorate trusted windows in a uservisible way. Because capabilities are unforgeable, an untrusted application cannot contrive to appear trustworthy. In two short sentences, we have reduced the problem of application security to properly designing the file dialog and ensuring the operating system’s security and trustworthiness, which is something we can solve. In a capability system, this type of mechanism is readily enforceable. If the installer doesn’t give the application trusted access to the window system, there is no way that the application can forge a trusted dialog box. Similarly, if the only access to the user file system is provided through the file dialog tool, the application has no means to bypass the user’s consent when writing user files. Performance Current performance measurements for EROS are based on microbenchmarks.1 The results are limited but encouraging. Process creation in EROS, for example, involves five components: the requesting application, a constructor, a process creator, a storage allocator, and the newly created components. In spite of this highly decomposed design, the EROS process creation mechanism is three times faster than the Linux fork and
exec mechanism. Page faults and mapping management in EROS are over 1,000 times faster than Linux. This is not a noticeable source of delay in typical Linux applications, but it is an absolutely critical performance issue in component systems. Because EROS does not yet have a Unix emulator, it is difficult to directly compare applications. KeyKOS included a binary-compatible Posix implementation that was directly comparable with the performance of the Mach-based Unix implementation.6 We expect that the EROS compatibility implementation will do significantly better.
It is difficult to measure how much of the testability and performance of the EROS family is due to principles versus careful implementation and design. Probably the clearest impact of principles on the design results is from the accountability principle, because it has forced us as architects to think carefully about resource manipulation and protection. In terms of impact, security principles run a close second. Whether EROS will ultimately be successful remains to be seen, but the EROS family has achieved something fairly unusual: a verified security architecture with a running, high-performance implementation. As a result, EROS is currently being evaluated for incorporation into various commercial consumer devices. EROS is also being evaluated for reliability-critical services such as lightweight directory access protocol implementations and Web servers. Two anecdotal facts are encouraging indicators: it has been well over eight years since we have found an EROS kernel bug that an assertion check didn't catch. This suggests that the principle-driven design has helped us build a more reliable system by letting us check for errors effectively. The Systems Research Laboratory at Johns Hopkins University is building a second version of EROS, restructured to support real-time and embedded applications. We anticipate seeking EAL7 assurance evaluation—the highest level currently defined—for this system under the Common Criteria process.7 We have also observed that programmers using EROS develop their programs in a qualitatively different way than, say, Unix
developers. The system architecture encourages them to factor applications into manageable pieces, and the protection boundaries help make these pieces more testable. There are other features of the system that encourage this as well—most notably, the event-driven style of component code. EROS is the first system in the EROS/KeyKOS family that has been exposed to a significant number of programmers. It is still too early for a list of design patterns to clearly emerge. It is striking, however, that students can master the system and build applications quickly, even though various simplifying abstractions are not provided. The greatest practical impediment to learning seems to be abandoning their Unix-based assumptions about how processes work. Often, we find that they ask questions such as, "How do I duplicate this Unix functionality?" when they can achieve their real objective more simply using the mechanisms provided.

References
1. J.S. Shapiro, J.M. Smith, and D.J. Farber, "EROS: A Fast Capability System," Proc. 17th ACM Symp. Operating Systems Principles, ACM Press, New York, 1999, pp. 170–185.
2. N. Hardy, "The KeyKOS Architecture," Operating Systems Rev., vol. 19, no. 4, Oct. 1985, pp. 8–25.
3. J.H. Saltzer and M.D. Schroeder, "The Protection of Information in Computer Systems," Proc. IEEE, vol. 63, no. 9, 1975, pp. 1278–1308.
4. J.S. Shapiro and S. Weber, "Verifying the EROS Confinement Mechanism," Proc. 2000 IEEE Symp. Security and Privacy, IEEE Press, Piscataway, N.J., 2000, pp. 166–176.
5. B.W. Lampson, "A Note on the Confinement Problem," Comm. ACM, vol. 16, no. 10, 1973, pp. 613–615.
6. A.C. Bomberger et al., "The KeyKOS Nanokernel Architecture," Proc. Usenix Workshop Micro-Kernels and Other Kernel Architectures, Usenix, San Diego, 1992, pp. 95–112.
7. Common Criteria for Information Technology Security, ISO/IS 15408, Int'l Standards Organization, Final Committee Draft, version 2.0, 1998.

About the Authors
Jonathan Shapiro is an assistant professor in the Department of Computer Science at Johns Hopkins University. His research interests include computer security, operating systems, and development tools. He received a PhD in computer science from the University of Pennsylvania. Contact him at [email protected].
Norm Hardy is a senior architect at Agorics, Inc. His research interests include operating systems, security, and programming languages. He received a BS in mathematics and physics from the University of California at Berkeley. Contact him at [email protected].
focus
building software securely
Composing Security-Aware Software
Khaled M. Khan, University of Western Sydney
Jun Han, Monash University
This article introduces a component security characterization framework that exposes security profiles of software components to inspire trust among software engineers.

The resurgence of component-based software development offers software engineers new opportunities to acquire third-party components to deliver system functionality. Component-based software engineering represents the concepts of assembly and coupling of components—essential to most engineering disciplines. The development paradigm of coupling and decoupling software components is receiving a great deal of interest from industry and academia, as it promises maximum
benefits of software reusability. While software components have become popular, security concerns are paramount. Their composition can be considered risky because of the "plug and play" with unknown third-party components. In dynamic runtime applications for critical systems such as e-commerce and e-health, the risk could be much higher. Component security concerns are twofold: how to build secure components and secure composite systems from components, and how to disclose components' security properties to others. This article addresses the latter; rather than propose any new security architecture, we present a security characterization framework. Our approach concerns the security functions of software components by exposing their required and ensured security properties. Through a compositional security contract (CSC) between participating components, system integrators can
reason about the security effect of one component on another. A CSC is based on the degree of conformity between the required security properties of one component and the ensured security properties of another. However, whether the characterized and disclosed security properties suffice to build a secure composite system is outside the scope of this article. System integrators should address this concern at the time of composition.

Component mistrust
There is now an open challenge on how to cultivate and inspire software developers' much-needed trust in third-party components. The attributes that most affect a trust relationship are identity, origin, and security properties that components offer to and require from other components. If the developer doesn't know these attributes during system integration, the component might not
be trustworthy. In current practice, the trust-related attributes are often neither expressed nor communicated. Software developers are reluctant to trust a third-party software component that does not tell much about its security profile. Despite these shortcomings, software engineers are still inclined to use them to minimize development effort and time. Today, trust in an application system is based on consent—that is, the user is explicitly asked to consent or decline to use a system.2 At the application level, such consent-based trust perhaps works fine. But in a component-based development environment, universally shallow commitment regarding component security is dangerously illusive and can trigger costly consequences. Trust requirements in a development environment significantly differ from those of application users. Component security—based on various nondeterministic elements such as the use domain, magnitude of the hostility in the use context, value of the data, and other related factors—is relative, particularly in a component-based development environment. Therefore, software engineers must be assured with more than just a component security or insecurity claim. Whatever small role a component plays, the software engineer cannot rule out its possible security threats to the entire application. Component developers might not be aware of the security requirements of their products' potential operational contexts. Software engineers do not expect such knowledge from the component developer, but they do expect a clear specification of the component security requirements and assurances.1 This information should be made available if queried at runtime. Developers must be able to do runtime tests with candidate components to find possible security matches and mismatches. The major concern—the disclosure of components' security properties and security mismatches of those properties—has received little attention from the security and software engineering research communities. Current practices and research for security of component-based software consist of several defensive lines such as firewalls, trusted operating systems, security wrappers, secure servers, and so on. Some significant work on component testing, component assurances, and security certification has been done, particularly in the last two years.3-5 These efforts basically concentrated on how to make a component secure, how to assure security using digital certification, and how to maximize testing efforts to increase the quality of individual components. Undoubtedly, such work is important to inspire trust, but we must explore other possibilities that would let software engineers know and evaluate the actual security properties of a component for specific applications.

Our approach
Driven by these ideas and motivations, we propose a security characterization framework in this article. The framework addresses how to characterize the security properties of components, how to analyze at runtime the internal security properties of a system comprising several atomic components, how to characterize the entire system's security properties, and how to make these characterized properties available at runtime. To inspire trust in a particular composite system, a component's security contract with all the other components, the security provisions that each component requires from and ensures to the others, and the ultimate global security profile of the entire federated system should be clear.
Atomic components
Security properties and behaviors of a software system are categorized into 11 classes in ISO/IEC-15408 Common Criteria.6 These classes are made of members, called families, based on a set of security requirements. We will only discuss a subset of one such security class, user data protection, just to give a snapshot of our characterization framework. The publishable security properties related to user data protection of any atomic component can be categorized as required—a precondition that other interested parties must satisfy during development to access the ensured security services—or ensured—a postcondition that guarantees the security services once the precondition is met. Security properties are typically derived from security functions—the implementation of security policies. And the security policies are defined to withstand security threats and risks. A simple security function consists of one or more principals (a principal can be a
human, a component, or another application system, whoever uses the component), a resource such as data, security attributes such as keys or passwords, and security operations such as encryption. Based on these, three main elements characterize an ensured or required security property: security operations executed by the components to enforce security properties, security attributes required to perform the operation, and application data manipulated in a compositional contract.7 Using these elements, we can formulate a simple structure to characterize the security requirements and assurances of individual components:

ƒ(Oi, Kj, Dk)

where ƒ represents a security objective formed with three associated arguments; O is the security-related operation performed by the principal i in a compositional contract; K is a set of security attributes used by the principal; subscript j contains additional information about K such as key type, the key's owner, and so on; D is an arbitrary set of data or information that is affected by the operation O; and the subscript k contains additional information regarding D such as whether a digital signature is used or not. The following examples represent a required security property R (protect_in_data) and an ensured security property E (protect_out_data) of a component P:

RP = protect_in_data (encryptQ, keyP+, 'amount')
EP = protect_out_data (encryptP, keyQ+, file1P−.digi_sign).

In this example, component P's required property RP states that the data is to be encrypted by any component Q with component P's public key. A plus sign (+) after P denotes public key. The ensured property EP states that component P encrypts the data file with the public key of any component Q. The data is also digitally signed by P with its private key, denoted by the minus sign (−) after P. This format is specific to a particular type of security function related to user data protection. This notation, or a similar one, can be standardized for all components. However, an alternative structure might need to be
formulated to represent other security classes such as authentication, security audit, trusted path, privacy, and so on.

Analysis of security contracts
A component that broadcasts an event to receive a service is called a focal component. Software components that respond to the event are usually called candidate components, and they might reside at different remote locations. With the security characterization structure of atomic components previously explained, a CSC between two components such as x and y can be modeled as

Cx,y = (Ey ⇒ Rx) ∧ (Ex ⇒ Ry)

where C is a compositional security contract between focal component x and candidate component y. The expression E ⇒ R denotes that E implies R, or E satisfies R. The required or ensured security property of an existing CSC can be referred to as Cx,y.Ry or Cx,y.Ex respectively. The degree of conformity between the required security properties of one component and the ensured security properties of another is the ultimate CSC of the composite system.

Global security characterization
As in the case of atomic components, we also need to establish a global security characterization of a composite system, because it might be used in further composition as a component. In fact, developers often view this kind of system as a single entity or an atomic component, not as a collection of components, in such further compositions. Therefore, the CSC approach can specify the composite component security by required and ensured properties based on the structure and principles defined for the atomic component. However, such a characterization depends on the functionality that the composite system provides. We can derive the externally visible security properties—that is, the required and ensured properties—of a composite component based on CSCs among participating components.

Notion of an active interface
The final and important issue in a security characterization framework is how to make components' security profiles and
Figure 1. Structure of an active interface. The focal component's interface exposes its ComponentID and operation signatures (read-only public properties, externally visible and verifiable), the atomic component's security characterization of required (R) and ensured (E) properties, visible to any external entities (read-only public properties), and a dynamic chain of CSC bases (CSC base1 through CSC basen) that grows and shrinks; each CSC base is visible only to the participating components (read-write protected properties) and is structurally connected to the next CSC in the chain using pointers, supporting composition with multiple candidate components.
CSCs available to other components. This article extends the CSC model to make it active, in the sense that each interface between components will have certain reasoning capability. We call them active interfaces. Current frameworks for software component models such as EJB, Corba, COM, and .Net are limited to the specification and matching of structural interface definitions.8 Interface description languages (IDLs) deal with the syntactic structure of the interface such as attributes, operations, and events. In our approach, an active interface not only contains the operations and attributes to serve a function but also embodies the security properties associated with a particular operation or functionality. An active interface supports a three-phase automatic negotiation model for component composition:
■ A component publishes its security properties attached with a functionality to the external world.
■ The component negotiates for a possible CSC at runtime with other interested candidate components.
■ If it succeeds, the negotiation results are used to configure and reconfigure the composition dynamically (a sketch of these phases follows).
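To make the three phases concrete, the following minimal C sketch shows one possible shape for an active interface's negotiation logic. The names SecurityProperty, SecurityProfile, satisfies, and negotiate_csc are hypothetical illustrations rather than part of the framework's published interface, and the string comparison merely stands in for the E ⇒ R conformity test described earlier.

    #include <string.h>

    typedef struct {
        const char *operation;   /* O: security operation, e.g. "encrypt" */
        const char *attribute;   /* K: security attribute, e.g. "keyS+"   */
        const char *data;        /* D: data affected, e.g. "diagnosis"    */
    } SecurityProperty;          /* one f(O, K, D) term                    */

    typedef struct {
        SecurityProperty required;  /* R: what the peer must provide       */
        SecurityProperty ensured;   /* E: what this component guarantees   */
    } SecurityProfile;

    /* Stand-in for E => R: here simply an exact match of the three
       elements; a real conformity test would be richer. */
    static int satisfies (const SecurityProperty *e, const SecurityProperty *r)
    {
        return strcmp (e->operation, r->operation) == 0
            && strcmp (e->attribute, r->attribute) == 0
            && strcmp (e->data, r->data) == 0;
    }

    /* Phase 2 of the negotiation: compute the CSC conformity between focal
       component x and candidate y, that is (Ey => Rx) and (Ex => Ry).
       Phase 1 (publishing the profiles) happens before this call, and
       phase 3 (configuring the binding) only if it returns nonzero. */
    int negotiate_csc (const SecurityProfile *x, const SecurityProfile *y)
    {
        return satisfies (&y->ensured, &x->required)
            && satisfies (&x->ensured, &y->required);
    }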
Active interface structure
An active interface consists of a component identity, a static interface signature, a static (read-only) security knowledge base of the component, and a (read–write) CSC
base that is dynamic based on the information available from the security knowledge base.7 Before a component is available for use, a certifying authority must certify it. A certificate ensures that the implementation matches the published functionality and the exposed security properties. It is argued that software components can only be tested and certified individually—not within the context of the complete composite system.8 The certified assurances must be verifiable statically and dynamically. Figure 1 illustrates a skeleton of an active interface structure. The ComponentID in the active interface includes a unique identity (UID) provided by a certifying authority, the component's current residing address (URL), details about the component developer, and the certification authority that certified the component:

ComponentID (uid, URL, developer_ID, certificate)

A certifying authority will verify, certify, and digitally stamp all of this data.5 It can further reveal more identity information if queried about the certificate, certification stamp, validity period, and so on. All identity and certification information is read-only and public—only the certifying authority can alter it.

Operations and arguments
An interface signature consists of operations and attributes for a particular functionality. These operations and attributes
are used for structural plug-and-play matching. These properties are static—read-only properties. Components cannot make any modification to this. This interface is intended to make a structural match before two components are composed.

Security knowledge base
A security knowledge base stores and makes available the security properties of a component in terms of ƒ(Oi, Kj, Dk). The required and ensured properties stored in this KB are specific to the functionality that the component offers. These properties must be based on the actual security functions that the component uses to accomplish a particular functionality. A component might offer various functions, so the exposed security properties can vary accordingly. Once the information is stored in a KB and certified, no other entities can alter its content. Any recompilation of the certified component would automatically erase all certification and identity information stored in ComponentID. If the component needs to alter its security properties, it requires a new certificate after the recompilation.

CSC base
A binary executable piece of code residing in the active interface of the focal component generates CSC conformity results between the focal component and a candidate component. If the system identifies nonconformance between the required and ensured properties, it concludes with a security mismatch. The resulting CSC is automatically stored in the CSC base of the focal component, and remains there as long as the composition is valid.7 Also, a component can accept a partially or completely mismatched CSC, although this might have negative security effects on the global system. If a component becomes obsolete or is no longer needed in a dynamic composition, the associated obsolete CSC might be stored in a log belonging to the focal component for future audit purposes, but it would not be available to any of the participating components.

An example
We use a fictitious distributed-system topology as an example of how our proposed active interface would work in a distributed
environment. Consider an e-health care system that regards all clinical information passing among the stakeholders, such as the general practitioners, specialists, patients, and pharmacists, as confidential. Assume a focal component Y running on a machine at a GP's office connects with a trusted candidate component S chosen from among many such systems running at various specialists' offices. Y provides a patient's diagnosis report to S to get a prescription. After receiving the prescription from S, Y sends it electronically to a candidate component P residing on a pharmacist's system for a price quotation. Developers would independently develop many such Ps and Ss and make them available from their various distributed sources, potentially able to deliver the functionality that Y wants. However, component Y not only is interested in specific functionality but also wants to know upfront the security properties that those components provide.

Binary CSC
Assume component Y exposes the following required and ensured security properties:

SECURITY {
  REQUIREDY {RY = protect_in_data (encryptS, keyS-, 'prescription'S.digi_sign)}
  ENSUREDY  {EY = protect_out_data (encryptY, keyS+, 'diagnosis')}}.
The ensured property states that Y will provide a diagnosis report of a patient to a specialist component. Y would encrypt (encryptY) the report with S’s public key (keyS+). In return, Y requires from S that S digitally sign (‘prescription’S.digi_sign) and encrypt (encryptS) the prescription it sends with its own private key (keyS-). Now assume that in response to the event Y broadcasts for the functionality Get_prescription, it receives responses from components S1 and S2 offering that functionality. S1 and S2 run on different machines for different specialists; they are independently developed and serviced by different developers and have their own security requirements and assurances. Y also reads the certification information, origin, and identity of the components from the interfaces of S1 and S2. Y first queries S1. S1’s interface exposes its security properties
stored in its static KB as

SECURITY {
  REQUIREDS1 {RS1 = protect_in_data (encryptY, keyS1+, 'diagnosis')}
  ENSUREDS1  {ES1 = protect_out_data (encryptS1, keyY+, 'prescription')}}.
According to these security properties, component S1 requires that component Y encrypt the diagnosis report with S1's public key. In return, S1 would encrypt the prescription with Y's public key, but S1 would not digitally sign the prescription data. Y's active interface now generates the CSC between Y and S1 based on CY,S1 = ((EY ⇒ RS1) ∧ (ES1 ⇒ RY)). In the generated CSC, S1's ensured security property has not fully satisfied Y's required security property, because S1 does not provide the digital signature with the prescription, as Y requires. After making a similar query to S2, Y reads S2's disclosed security properties as

SECURITY {
  REQUIREDS2 {RS2 = protect_in_data (encryptY, keyS2+, 'diagnosis')}
  ENSUREDS2  {ES2 = protect_out_data (encryptS2, keyS2-, 'prescription'S2.digi_sign)}}.
Component S2 requires Y to encrypt the diagnosis report with S2's public key. In return, S2 ensures that it would digitally sign ('prescription'S2.digi_sign) and encrypt the prescription with its private key. Y can decrypt the message using S2's public key to verify the signature. Based on these security properties, the generated CSC is consistent with the requirements of Y and S2. Y can finally be combined with S2. The resulting properties are now stored in Y's CSC base1 for future reference.

Transitive CSC
We extend the same scenario further to examine how our framework can support a transitive composition with multiple components. A transitive CSC occurs when two
components are composed in an extended sense—that is, when a CSC between two components is influenced by the ensured or required property of a third component or even by another existing CSC. After Y has combined with S2, it then looks for a third component P that would provide a price quotation for the prescription produced by S2. Y's security properties for this particular functionality are

SECURITY {
  REQUIREDY {RY = protect_in_data (encryptP, keyP-, 'price'P.digi_sign)}
  ENSUREDY  {EY = protect_out_data (encryptY, keyP+, CY,S2.ES2.(prescription)S2.digi_sign)}}.
According to these properties, Y agrees to attach S2's digital signature (CY,S2.ES2.(prescription)S2.digi_sign) to a component P to ensure that a specialist authenticates the prescription. In return, Y requires that P digitally sign and encrypt the price data. Note that these security properties of Y are quite different from those for the specialist prescription. Now assume that in response to Y's broadcasting a request for a price quotation, remote components P1 and P3 have registered their interests in providing the functionality that Y wants. P1 and P3 are developed and serviced by two different development organizations and have their own security requirements and assurances. Y now runs a security test with P1 to verify whether the component could deliver the functionality as well as the security that Y requires. It also verifies whether Y by itself could satisfy P1's required property. Y reads the security properties exposed by P1 as

SECURITY {
  REQUIREDP1 {RP1 = protect_in_data (encryptY, keyY-, (CY,S.ES)Y.digi_sign)}
  ENSUREDP1  {EP1 = protect_out_data (encryptP1, keyP-, 'price'P1.digi_sign)}}.
Y's interface generates a CSC. It shows that P1's ensured security property has satisfied Y's required property, but that Y has not satisfied P1's required property. This is because P1 requires a digital signature from
Figure 2. A composite system with two compositional contracts. Focal component Y (ComponentID:Y) holds two CSC bases: CY,S2 = (EY ⇒ RS2) ∧ (ES2 ⇒ RY) for its explicit composition with candidate S2 (ComponentID:S2, makeReport), and CY,P3 = (EY ⇒ RP3) ∧ (EP3 ⇒ RY) for its explicit composition with candidate P3 (ComponentID:P3, quotePrice). Candidates S1 (makeReport) and P1 (quotePrice) failed their security tests, and P3 is implicitly composed with S2 through Y.
Y, but Y does not have one. Thus, Y's security test with P1 has failed. Y now makes another security query to P3. Y reads the exposed security properties of P3 as

SECURITY {
  REQUIREDP3 {RP3 = protect_in_data (encryptY, keyP3+, CY,S.ES.(prescription)S.digi_sign)}
  ENSUREDP3  {EP3 = protect_out_data (encryptP3, keyP3-, 'price'P3.digi_sign)}}.
Y's interface now generates the CSC with P3 based on CY,P3 = ((EY ⇒ RP3) ∧ (EP3 ⇒ RY)). The derived CSC is consistent with the required compositional contract between Y
and P3. Satisfied with this CSC, Y combines with P3 and stores the CSC in its CSC base2. Interestingly enough, the compositional contract CY,P3 involves three components in the relationship chain:

CY,P3 = (((CY,S2.ES2.(prescription)S2.digi_sign) ⇒ RP3) ∧ (EP3 ⇒ RY)).
The entire system scenario is shown in Figure 2. There are two CSCs in this system: one between Y and S2 (shown by the red dotted line) and the other between P3 and Y (shown by the larger blue dotted line). In the latter composition, S2 is transitively composed with P3 because P3’s security requirements partly depend on S2’s security assurances, although P3 does not have any direct composition with S2. With the previous examples, we have demonstrated that software components
can know and reason about the actual security requirements and assurances of others before an actual composition takes place. The example also suggests that a security characterization is a mechanism to provide “informed consent.”2 An informed consent gives the participating entities explicit opportunity to consent or decline to use components after assessing the candidate components’ security properties.
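As a rough illustration of the binary CSC checks in this example, the following self-contained C sketch encodes only the two properties that decide the Y–specialist negotiation (encryption with the specialist's private key and a digital signature on the prescription). The struct layout and function names are hypothetical, chosen for the illustration rather than taken from the framework.

    #include <stdio.h>

    /* Simplified view of the property Y requires for the returned prescription. */
    typedef struct {
        int encrypted_with_private_key;  /* prescription encrypted with the specialist's private key */
        int digitally_signed;            /* prescription carries the specialist's signature */
    } PrescriptionProperty;

    /* E => R for this simplified property. */
    static int satisfies (PrescriptionProperty ensured, PrescriptionProperty required)
    {
        return (!required.encrypted_with_private_key || ensured.encrypted_with_private_key)
            && (!required.digitally_signed || ensured.digitally_signed);
    }

    int main (void)
    {
        PrescriptionProperty r_y  = {1, 1};   /* RY: signed and encrypted with keyS-  */
        PrescriptionProperty e_s1 = {0, 0};   /* ES1: encrypted with keyY+, unsigned  */
        PrescriptionProperty e_s2 = {1, 1};   /* ES2: signed, encrypted with keyS2-   */

        printf ("Y with S1: %s\n", satisfies (e_s1, r_y) ? "CSC holds" : "security mismatch");
        printf ("Y with S2: %s\n", satisfies (e_s2, r_y) ? "CSC holds" : "security mismatch");
        return 0;
    }

Running this sketch reports a mismatch for S1 and a successful contract for S2, mirroring the outcome described above.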
Our framework's main objective is to generate computational reflection to let components and their developers identify and capture the various security properties of the other components with which they cooperate. In such a setting, components not only read the metadescription of others' security properties but also identify security mismatches between two components and evaluate composability realistically. Security characterization and third-party certification of components would mutually benefit each other: first, a security characterization would contribute significantly to the process of component security certification; second, certification would make the exposed security properties more credible to software engineers. When required and ensured security properties are spelled out in simple, comprehensible terms, software engineers are better positioned to evaluate the strength of the security a component provides. They are also well informed about what to expect from and provide to the component to establish a viable composition. In a software engineering context, we must balance security against the other design goals of the entire component-based system. To achieve this, application developers must know about components' security properties. A trusting profile could be gradually built and inspired on the basis of the participating components' self-disclosure of their security properties. The security properties built into a component represent the efforts already put into place to withstand certain security threats. However, whether a component's committed effort actually protects it from a security threat is beyond the component's control. Whether the available resources disclosed by the component are sufficient to withstand a threat is outside the parameters of our framework. A trust-generating effort
could only be viable by exposing actual certified security properties of interested parties in a composition as opposed to “secure or insecure” claims. We acknowledge that software engineers’ trust in unfamiliar components is understandably difficult to cultivate and that complete trust is undoubtedly desirable, but we believe that our approach would at least contribute to such trust.
Acknowledgments
The work reported here has benefited from earlier discussions with Yuliang Zheng while he was with Monash University.

References
1. D. Carney and F. Long, "What Do You Mean by COTS?," IEEE Software, vol. 18, no. 2, Mar./Apr. 2001, pp. 82–86.
2. B. Friedman, P.H. Kahn Jr., and D.C. Howe, "Trust Online," Comm. ACM, vol. 43, no. 12, Dec. 2000, pp. 34–44.
3. J. Voas, "Certifying Software for High-Assurance Environments," IEEE Software, vol. 16, no. 4, July/Aug. 1999, pp. 48–54.
4. W. Councill, "Third-Party Testing and the Quality of Software Components," IEEE Software, vol. 16, no. 4, July/Aug. 1999, pp. 55–57.
5. A. Ghosh and G. McGraw, "An Approach for Certifying Security in Software Components," Proc. 21st Nat'l Information Systems Security Conf., Nat'l Inst. Standards and Technology, Crystal City, Va., 1998, pp. 82–86.
6. Common Criteria for Information Technology Security Evaluation, ISO/IEC 15408, v2.0, Nat'l Inst. Standards and Technology, Washington, D.C., June 1999; http://csrc.nist.gov/cc (current Dec. 2001).
7. K. Khan, J. Han, and Y. Zheng, "A Framework for an Active Interface to Characterize Compositional Security Contracts of Software Components," Proc. Australian Software Eng. Conf., IEEE CS Press, Los Alamitos, Calif., 2001, pp. 117–126.
8. J. Hopkins, "Component Primer," Comm. ACM, vol. 43, no. 10, Oct. 2000, pp. 27–30.
About the Authors
Khaled M. Khan is a lecturer in the School of Computing and Information Technology at the University of Western Sydney, Australia. His research interests include software components, software maintenance, and software metrics. He received a BS and an MS in computer science and informatics from the University of Trondheim, Norway, and another BS from the University of Dhaka, Bangladesh. He is a member of the IEEE Computer Society. Contact him at [email protected].
Jun Han is an associate professor in the School of Network Computing at Monash University, Australia, where he directs the Enterprise and Software Systems Engineering and Technology Laboratory. His research interests include component-based software systems, software engineering tools and methods, and enterprise systems engineering. He received a BEng and MEng in computer science and engineering from the Beijing University of Science and Technology and a PhD in computer science from the University of Queensland. He is a member of the IEEE Computer Society. Contact him at [email protected].
focus
building software securely
Improving Security Using Extensible Lightweight Static Analysis
David Evans and David Larochelle, University of Virginia
Security attacks that exploit well-known implementation flaws occur with disturbing frequency because the software development process does not include techniques for preventing those flaws. The authors have developed an extensible tool that detects common flaws using lightweight static analysis.

Building secure systems involves numerous complex and challenging problems, ranging from building strong cryptosystems and designing authentication protocols to producing a trust model and security policy. Despite these challenges, most security attacks exploit either human weaknesses—such as poorly chosen passwords and careless configuration—or software implementation flaws. Although it's hard to do much about human frailties, some help is available through
education, better interface design, and security-conscious defaults. With software implementation flaws, however, the problems are typically both preventable and well understood. Analyzing reports of security attacks quickly reveals that most attacks do not result from clever attackers discovering new kinds of flaws, but rather stem from repeated exploits of well-known problems. Figure 1 summarizes Mitre’s Common Vulnerabilities and Exposures list of 190 entries from 1 January 2001 through 18 September 2001.1 Thirty-seven of these entries are standard buffer overflow vulnerabilities (including three related memory-access vulnerabilities), and 11 involve format bugs. Most of the rest also reveal common flaws detectable by static analysis, including resource leaks (11), file name problems (19), and symbolic links (20). Only four of the entries involve
cryptographic problems. Analyses of other vulnerability and incident reports reveal similar repetition. For example, David Wagner and his colleagues found that buffer overflow vulnerabilities account for approximately 50 percent of the Software Engineering Institute's CERT advisories.2 So why do developers keep making the same mistakes? Some errors are caused by legacy code, others by programmers' carelessness or lack of awareness about security concerns. However, the root problem is that while security vulnerabilities, such as buffer overflows, are well understood, the techniques for avoiding them are not codified into the development process. Even conscientious programmers can overlook security issues, especially those that rely on undocumented assumptions about procedures and data types. Instead of relying on programmers' memories, we should strive to produce
tools that codify what is known about common security vulnerabilities and integrate it directly into the development process. This article describes a way to codify that knowledge. We describe Splint, a tool that uses lightweight static analysis to detect likely vulnerabilities in programs. Splint’s analyses are similar to those done by a compiler. Hence, they are efficient and scalable, but they can detect a wide range of implementation flaws by exploiting annotations added to programs.
Figure 1. Common Vulnerabilities and Exposures list for the first nine months of 2001. Most of the entries are common flaws detectable by static analysis, including 37 buffer overflow vulnerabilities. The breakdown: buffer overflows 19%, malformed input 16%, access 16%, other 16%, symbolic links 11%, pathnames 10%, format bugs 6%, and resource leaks 6%.
Mitigating software vulnerabilities
"Our recommendation now is the same as our recommendation a month ago, if you haven't patched your software, do so now." —Scott Culp, security program manager for Microsoft's security response center
In this quotation, Culp is commenting on the Internet Information Server's buffer overflow vulnerability that was exploited by the Code Red worm to acquire over 300,000 zombie machines for launching a denial-of-service attack on the White House Web site. The quotation suggests one way to deal with security vulnerabilities: wait until the bugs are exploited by an attacker, produce a patch that you hope fixes the problem without introducing new bugs, and whine when system administrators don't install patches quickly enough. Not surprisingly, this approach has proven largely ineffective. We can group more promising approaches for reducing software flaw damage into two categories: mitigate the damage flaws can cause, or eliminate flaws before the software is deployed.

Limiting damage
Techniques that limit security risks from software flaws include modifying program binaries to insert runtime checks or running applications in restricted environments to limit what they may do.3,4 Other projects have developed safe libraries5 and compiler modifications6 specifically for addressing classes of buffer overflow vulnerabilities. These approaches all reduce the risk of security vulnerabilities while requiring only minimal extra work from application developers. One disadvantage of runtime damage-limitation approaches is that they increase performance overhead. More importantly,
such approaches do not eliminate the flaw but simply replace it with a denial-of-service vulnerability. Recovering from a detected problem typically requires terminating the program. Hence, although security-sensitive applications should use damage-limitation techniques, the approach should not supplant techniques for eliminating flaws.

Eliminating flaws
Techniques to detect and correct software flaws include human code reviews, testing, and static analysis. Human code reviews are time-consuming and expensive but can find conceptual problems that are impossible to find automatically. However, even extraordinarily thorough people are likely to overlook more mundane problems. Code reviews depend on the expertise of the human reviewers, whereas automated techniques can benefit from expert knowledge codified in tools. Testing is typically ineffective for finding security vulnerabilities. Attackers attempt to exploit weaknesses that system designers did not consider, and standard testing is unlikely to uncover such weaknesses. Static analysis techniques take a different approach. Rather than observe program executions, they analyze source code directly. Thus, using static analysis lets us make claims about all possible program executions rather than just the test-case execution. From a security viewpoint, this is a significant advantage. There is a range of static analysis techniques, offering tradeoffs between the required
effort and analysis complexity. At the low-effort end are standard compilers, which perform type checking and other simple program analyses. At the other extreme, full program verifiers attempt to prove complex properties about programs. They typically require a complete formal specification and use automated theorem provers. These techniques have been effective but are almost always too expensive and cumbersome to use on even security-critical programs.

Our approach
We use lightweight static analysis techniques that require incrementally more effort than using a compiler but a fraction of the effort required for full program verification. This requires certain compromises. In particular, we use heuristics to assist in the analysis. Our design criteria eschew theoretical claims in favor of useful results. Detecting likely vulnerabilities in real programs depends on making compromises that increase the class of properties that can be checked while sacrificing soundness and completeness. This means that our checker will sometimes generate false warnings and sometimes miss real problems, but our goal is to create a tool that produces useful results for real programs with a reasonable effort.

Splint overview
Splint (previously known as LCLint) is a lightweight static analysis tool for ANSI C. Here, we describe Splint, version 3.0.1, which is available as source code and binaries for several platforms under GPL from www.splint.org. We designed Splint to be as fast and easy to use as a compiler. It can do checking no compiler can do, however, by exploiting annotations added to libraries and programs that document assumptions and intents. Splint finds potential vulnerabilities by checking to see that source code is consistent with the properties implied by annotations.

Annotations
We denote annotations using stylized C comments identified by an @ character following the /* comment marker. We associate annotations syntactically with function parameters and return values, global variables, and structure fields. The annotation /*@notnull@*/, for example,
can be used syntactically like a type qualifier. In a parameter declaration, the notnull annotation documents an assumption that the value passed for this parameter is not NULL. Given this, Splint reports a warning for any call site where the actual parameter might be NULL. In checking the function's implementation, Splint assumes that the notnull-annotated parameter's initial value is not NULL. On a return value declaration, a notnull annotation would indicate that the function never returns NULL. Splint would then report a warning for any return path that might return NULL, and would check the callsite assuming the function result is never NULL. In a global variable declaration, a notnull annotation indicates that the variable's value will not be NULL at an interface point—that is, it might be NULL within the function's body but would not be NULL at a call site or return point. Failure to handle possible NULL return values is unlikely to be detected in normal testing, but is often exploited by denial of service attacks. Annotations can also document assumptions over an object's lifetime. For example, we use the only annotation on a pointer reference to indicate that the reference is the sole long-lived reference to its target storage (there might also be temporary local aliases). An only annotation implies an obligation to release storage. The system does this either by passing the object as a parameter annotated with only, returning the object as a result annotated with only, or assigning the object to an external reference annotated with only. Each of these options transfers the obligation to some other reference. For example, the library storage allocator malloc is annotated with only on its result, and the deallocator free also takes an only parameter. Hence, one way to satisfy the obligation to release malloc's storage is to pass it to free. Splint reports a warning for any code path that fails to satisfy the storage-release obligation, because it causes a memory leak. Although memory leaks do not typically constitute a direct security threat, attackers can exploit them to increase a denial-of-service attack's effectiveness. In the first half of 2001, three of the Common Vulnerabilities and Exposures entries involved memory leaks (CVE-2001-0017, CVE-2001-0041, and CVE-2001-0055). Some storage management can't be
modeled with only references, as the programs must share references across procedure and structure boundaries. To contend with this, Splint provides annotations for describing different storage management models.7

Analysis
There are both theoretical and practical limits to what we can analyze statically. Precise analysis of the most interesting properties of arbitrary C programs depends on several undecidable problems, including reachability and determining possible aliases.8 Given this, we could either limit our checking to issues like type checking, which do not depend on solving undecidable problems, or admit to some imprecision in our results. Because our goal is to do as much useful checking as possible, we allow checking that is both unsound and incomplete. Splint thus produces both false positives and false negatives. We intend the warnings to be as useful as possible to programmers but offer no guarantee that all messages indicate real bugs or that all bugs will be found. We also make it easy for users to configure checking to suppress particular messages and weaken or strengthen checking assumptions. With static analysis, designers face a tradeoff between precision and scalability. To make our analysis fast and scalable to large programs, we made certain compromises. The most important is to limit our analysis to data flow within procedure bodies. Splint analyzes procedure calls using information from annotations that describes preconditions and postconditions. We made another compromise between flow-sensitive analysis, which considers all program paths, and flow-insensitive analysis, which ignores control flow. Splint considers control-flow paths, but, to limit analysis path blowup, it merges possible paths at branch points. It analyzes loops using heuristics to recognize common idioms. This lets Splint correctly determine the number of iterations and bounds of many loops without requiring loop invariants or abstract evaluation. Splint's simplifying assumptions are sometimes wrong; this often reveals convoluted code that is a challenge for both humans and automated tools to analyze. Hence, we provide easy ways for programmers to customize checking behavior locally and suppress spurious warnings that result from imprecise analysis.
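To make the notnull and only annotations described above concrete, here is a small hypothetical example; the function name and its body are illustrative and are not taken from the Splint distribution.

    #include <stdlib.h>
    #include <string.h>

    /* The caller must pass a non-NULL string and becomes responsible for
       releasing the returned storage (the only annotation on the result). */
    /*@only@*/ /*@null@*/ char *copy_string (/*@notnull@*/ const char *s)
    {
        char *r = (char *) malloc (strlen (s) + 1);  /* malloc's result carries an only obligation */

        if (r != NULL)
        {
            strcpy (r, s);
        }
        return r;   /* the obligation to call free is transferred to the caller */
    }

If a caller discards the result without releasing it, or passes an argument that might be NULL, the annotations give Splint grounds to flag the inconsistency.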
Buffer overflows
Buffer overflow vulnerabilities are perhaps the single most important security problem of the past decade. The simplest buffer overflow attack, stack smashing, overwrites a buffer on the stack, replacing the return address. Thus, when the function returns, instead of jumping to the return address, control jumps to the address the attacker placed on the stack. The attacker can then execute arbitrary code. Buffer overflow attacks can also exploit buffers on the heap, but these are less common and harder to create. C programs are particularly vulnerable to buffer overflow attacks. C was designed with an emphasis on performance and simplicity rather than security and reliability. It provides direct low-level memory access and pointer arithmetic without bounds checking. Worse, the ANSI C library provides unsafe functions—such as gets—that write an unbounded amount of user input into a fixed-size buffer without any bounds checking. Buffers stored on the stack are often passed to these functions. To exploit such vulnerabilities, an attacker merely enters an input larger than the buffer's size, encoding an attack program binary in the input. Splint detects both stack and heap-based buffer overflow vulnerabilities. The simplest detection techniques just identify calls to often misused functions; more precise techniques depend on function descriptions and program-value analysis.
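The following fragment is a minimal, hypothetical illustration of the stack-smashing pattern just described; it is not drawn from any real application.

    #include <stdio.h>

    void greet (void)
    {
        char buf[16];               /* fixed-size buffer on the stack */

        gets (buf);                 /* no bounds check: input longer than 15
                                       characters overwrites adjacent stack
                                       memory, including the return address */
        printf ("Hello, %s\n", buf);
    }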
Use warnings
The simplest way to detect possible buffer overflows is to produce a warning whenever the code uses library functions susceptible to buffer overflow vulnerabilities. The gets function is always vulnerable, so it seems reasonable for a static analysis tool to report all uses of gets. Other library functions, such as strcpy, can be used safely but are often the source of buffer overflow vulnerabilities. Splint provides the annotation warn flag-specifier message, which precedes a declaration to indicate that declarator use should produce a warning. For example, the Splint library declares gets with

/*@warn bufferoverflowhigh "Use of gets leads to … "@*/
to indicate that Splint should issue a warning message whenever gets is used and the bufferoverflowhigh flag is set. Several security scanning tools provide similar functionality, including Flawfinder (www.dwheeler.com/flawfinder), ITS4,9 and the Rough Auditing Tool for Security (www.securesw.com/rats). Unlike Splint, however, these tools use lexical analysis instead of parsing the code. This means they will report spurious warnings if the names of the vulnerable functions are used in other ways (for example, as local variables). The main limitation of use warnings is that they are so imprecise. They alert humans to possibly dangerous code but provide no assistance in determining whether a particular use of a potentially dangerous function is safe. To improve the results, we need a more precise specification of how a function might be safely used, and a more precise analysis of program values.

Describing functions
Consider the strcpy function: it takes two char * parameters (s1 and s2) and copies the string that the second parameter points to into the buffer to which the first parameter points. A call to strcpy will overflow the buffer pointed to by the first parameter if that buffer is not large enough to hold the string pointed to by the second parameter. This property can be described by adding a requires clause to the declaration of strcpy: /*@requires maxSet(s1) >= maxRead(s2) @*/. This precondition uses two buffer attribute annotations, maxSet and maxRead. The value of maxSet(b) is the highest integer i such that b[i] can be safely used as an lvalue (that is, on the left side of an assignment expression). The value of maxRead(b) is the highest integer i such that b[i] can be safely used as an rvalue. The s2 parameter also has a nullterminated annotation that indicates that it is a null-terminated character string. This implies that s2[i] must be a NUL character for some i <= maxRead(s2). At a call site, Splint produces a warning if a precondition is not satisfied. Hence, a call strcpy (s, t) would produce a warning if Splint cannot determine that maxSet(s) >= maxRead(t). The warning would indicate that the buffer allocated for s might be overrun by the strcpy call.
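A sketch of how such a description might look and be used follows. The declaration shows only the requires clause and nullterminated annotation discussed above (the full Splint library entry carries additional annotations), and copy_name is a hypothetical caller.

    /* Annotated declaration (simplified): writing through s1 must be safe
       for at least as many characters as can be read from s2. */
    char *strcpy (char *s1, /*@nullterminated@*/ const char *s2)
         /*@requires maxSet(s1) >= maxRead(s2)@*/ ;

    void copy_name (const char *username)    /* maxRead(username) unknown here */
    {
        char fixed[16];                       /* maxSet(fixed) == 15 */

        strcpy (fixed, "guest");              /* ok: maxRead("guest") == 5 <= 15 */
        strcpy (fixed, username);             /* warning: maxSet(fixed) >= maxRead(username)
                                                 cannot be established */
    }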
Analyzing program values
Splint analyzes a function body starting from the annotated preconditions and checks that the function implementation ensures the postconditions. It generates preconditions and postconditions at the expression level in the parse tree using internal rules or, in the case of function calls, annotated descriptions. For example, the declaration char buf[MAXSIZE] generates the postconditions maxSet(buf) = MAXSIZE – 1 and minSet(buf) = 0. Where the expression buf[i] is used as an lvalue, Splint generates the precondition maxSet(buf) >= i. All constraint variables also identify particular code locations. Because a variable's value can change, the analysis must distinguish between values at different code points. Splint resolves preconditions using postconditions from previous statements and any annotated preconditions for the function. If it cannot resolve a generated precondition at the beginning of a function or satisfy a documented postcondition at the end, Splint issues a descriptive warning about the unsatisfied condition. Hence, for the buf[i] example above, Splint would produce a warning if it cannot determine that the value of i is between 0 and MAXSIZE – 1. Splint propagates constraints across statements using an axiomatic semantics and simplifies constraints using constraint-specific algebraic rules, such as maxSet(ptr + i) = maxSet(ptr) - i.
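As a small, hypothetical illustration of the constraints described above, consider a loop with an off-by-one bound; the exact warning text would be Splint's own, but the unsatisfiable precondition is visible by inspection.

    #define MAXSIZE 128

    void fill (void)
    {
        char buf[MAXSIZE];   /* postconditions: maxSet(buf) = MAXSIZE - 1, minSet(buf) = 0 */
        int i;

        for (i = 0; i <= MAXSIZE; i++)
        {
            buf[i] = 'x';    /* precondition maxSet(buf) >= i fails once i reaches
                                MAXSIZE, an off-by-one write past the buffer */
        }
    }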
To handle loops, we use heuristics that recognize common loop forms.10 Our experience indicates that a few heuristics can match many loops in real programs. This lets us effectively analyze loops without needing loop invariants or expensive analyses.

Extensible checking
In addition to the built-in checks, Splint provides mechanisms for defining new checks and annotations to detect new vulnerabilities or violations of application-specific properties. A large class of useful checks can be described as constraints on attributes associated with program objects or the global execution state. Unlike types, however, the values of these attributes can change along an execution path. Splint provides a general language that lets users define attributes associated with
different kinds of program objects as well as rules that both constrain the values of attributes at interface points and specify how attributes change. The limited expressiveness of user attributes means that Splint can check user-defined properties efficiently. Because user-defined attribute checking is integrated with normal checking, Splint's analysis of user-defined attributes can take advantage of other analyses, such as alias and nullness analysis. Next, we illustrate how user-defined checks can detect new vulnerabilities using a taintedness attribute to detect format bugs. We have also used extensible checking to detect misuses of files and sockets (such as failing to close a file or to reset a read/write file between certain operations) and incompatibilities between Unix and Win32.11

Detecting format bugs
In June 2000, researchers discovered a new class of vulnerability: the format bug.12 If an attacker can pass hostile input as the format string for a variable arguments routine such as printf, the attacker can write arbitrary values to memory and gain control over the host in a manner similar to a buffer overflow attack. The %n directive is particularly susceptible to attack, as it treats its corresponding argument as an int *, and stores the number of bytes printed so far in that location. A simple way to detect format vulnerabilities is to provide warnings for any format string that is unknown at compile time. If the +formatconst flag is set, Splint issues a warning at all callsites where a format string is not known at compile time. This can produce spurious messages, however, because there might be unknown format strings that are not vulnerable to hostile input.
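A minimal, hypothetical example of the vulnerable pattern and its fix; log_message is illustrative and not drawn from any real code base.

    #include <stdio.h>

    void log_message (const char *msg)   /* msg may ultimately come from user input */
    {
        printf (msg);          /* format bug: %s or %n directives embedded in msg are
                                  interpreted by printf, and the non-constant format
                                  string is exactly what +formatconst warns about */
        printf ("%s", msg);    /* safe: msg is treated purely as data */
    }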
Figure 2. Definition of the taintedness attribute.
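The calls these checks target look like the following fragment, which is ours for illustration rather than code from any of the programs discussed; user_input stands for any externally supplied string:

#include <stdio.h>

void log_request (const char *user_input)
{
   printf (user_input);         /* dangerous: %n, %s, ... in the input
                                   are interpreted as format directives */
   printf ("%s", user_input);   /* safe: user data is never used as the
                                   format string itself */
}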
A more precise way to detect format bugs is to only report warnings when the format string is derived from potentially malicious data (that is, when it comes from the user or external environment). Perl's taint option13 suggests a way to do this. The taint option, which is activated by running Perl with the -T flag, considers all user input as tainted and produces a runtime error, halting execution before a tainted value is used in an unsafe way. Untainted values can be derived from tainted input by using Perl's regular expression matching.

Taintedness attribute
Splint can be used to detect possibly dangerous operations with tainted values at compile time. To accomplish this, we define a taintedness attribute associated with char * objects and introduce the annotations tainted and untainted to indicate assumptions about the taintedness of a reference. Umesh Shankar and his colleagues used a similar approach.14 Instead of using attributes with explicit rules, they used type qualifiers. This lets them take advantage of type theory, and, in particular, use well-known type-inference algorithms to automatically infer the correct type qualifiers for many programs. Splint's attributes are more flexible and expressive than type qualifiers. Figure 2 shows the complete attribute definition. The first three lines define the taintedness attribute associated with char * objects, which can be in one of two states: untainted or tainted. The next clause specifies rules for transferring objects between references, for example, by passing a parameter or returning a result. The tainted as untainted ==> error rule directs Splint to report a warning when a tainted object is used where an untainted object is expected.
This would occur if the system passed a tainted object as an untainted parameter or returned it as an untainted result. All other transfers (for example, untainted as tainted) are implicitly permitted and leave the transferred object in its original state. Next, the merge clause indicates that combining tainted and untainted objects produces a tainted object. Thus, if a reference is tainted along one control path and untainted along another control path, checking assumes that it is tainted after the two branches merge. It is also used to merge taintedness states in function specifications (see the strcat example in the next section). The annotations clause defines two annotations that programmers can use in declarations to document taintedness assumptions. In this case, the names of the annotations match the taintedness states. The final clause specifies default values used for declarators without taintedness annotations. We choose default values to make it easy to start checking an unannotated program. Here we assume unannotated references are possibly tainted and Splint will report a warning where unannotated references are passed to functions that require untainted parameters. The warnings indicate either a format bug in the code or a place where an untainted annotation should be added. Running Splint again after adding the annotation will propagate the newly documented assumption through the program.

Specifying library functions
When the source code for library functions is unavailable, we cannot rely on the default annotations because Splint needs the source code to detect inconsistencies. We must therefore provide annotated declarations that document taintedness assumptions for standard library functions. We do this by providing annotated declarations in the tainted.xh file. For example,

int printf (/*@untainted@*/ char *fmt, ...);
indicates that the first argument to printf must be untainted. We can also use ensures clauses to indicate that a value is tainted after a call returns. For example, the first parameter to fgets is tainted after fgets returns:
char *fgets (/*@returned@*/ char *s, int n, FILE *stream) /*@ensures tainted s@*/ ;
The returned annotation on the parameter means that the return value aliases the storage passed as s, so the result is also tainted (Splint’s alias analysis also uses this information). We also must deal with functions that might take tainted or untainted objects, but where the final taintedness states of other parameters and results might depend on the parameters’ initial taintedness states. For example, strcat is annotated this way: char *strcat (/*@returned@*/ char *s1, char *s2) /*@ensures s1:taintedness = s1:taintedness | s2:taintedness@*/
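As a small illustration of how these transfer and merge rules combine in client code, consider the following sketch of ours (not from the article); MAXLINE is an assumed constant:

#include <stdio.h>
#include <string.h>

#define MAXLINE 512

void echo_command (void)
{
   char input[MAXLINE];
   char cmd[MAXLINE + 8] = "echo ";   /* literal ==> untainted */

   if (fgets (input, sizeof (input), stdin) != NULL) {
      strcat (cmd, input);   /* merge rule: tainted + untainted makes cmd tainted */
      printf (cmd);          /* reported: possibly tainted storage used as untainted */
   }
}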
Because the parameters lack annotations, they are implicitly tainted according to the default rules, and either untainted or tainted references can be passed as parameters to strcat. The ensures clause means that after strcat returns, the first parameter (and the result, because of the returned annotation on s1) will be tainted if either passed object was tainted. Splint merges the two taintedness states using the attribute definition rules—hence, if the s1 parameter is untainted and the s2 parameter is tainted, the result and first parameter will be tainted after strcat returns. Experience Using Splint is an iterative process. First, we run Splint to produce warnings and then change either the code or the annotations accordingly. Next, we run Splint again to check the changes and propagate the newly documented assumptions. We continue this process until Splint issues no warnings. Because Splint checks approximately 1,000 lines per second, running Splint repeatedly is not burdensome. Splint’s predecessor, LCLint, has been used to detect a range of problems, including data hiding15 and memory leaks, dead storage usage, and NULL dereferences7 on programs comprising hundreds of thousands of lines of code. LCLint is widely used
by working programmers, especially in the open-source development community.16,17 So far, our experience with buffer overflow checking and extensible checking is limited but encouraging. We have used Splint to detect both known and previously unknown buffer overflow vulnerabilities in wu-ftpd, a popular ftp server, and BIND, the libraries and tools that comprise the Domain Name System’s reference implementation. Here, we summarize our experience analyzing wu-ftpd version 2.5.0, a 20,000 line program with known (but not known specifically to the authors when the analysis was done) format and buffer overflow bugs. We detected the known flaws as well as finding some previously unknown flaws in wu-ftpd. It takes Splint less than four seconds to check all of wu-ftpd on a 1.2-GHz Athlon machine. Format bugs Running Splint on wu-ftpd version 2.5.0 produced two warnings regarding taintedness. The first one was: ftpd.c: (in function vreply) ftpd.c:4608:69: Invalid transfer from implicitly tainted fmt to untainted (Possibly tainted storage used as untainted.): vsnprintf(..., fmt, ...) ftpd.c:4586:33: fmt becomes implicitly tainted
In tainted.xh, vsnprintf is declared with an untainted annotation on its format string parameter. The passed value, fmt, is a parameter to vreply, and hence it might be tainted according to the default rules. We added an untainted annotation to the fmt parameter declaration to document the assumption that an untainted value must be passed to vreply. After adding the annotation, Splint reported three warnings for possibly tainted values passed to vreply in reply and lreply. We thus added three additional annotations. Running Splint again produced five warnings—three of which involved passing the global variable globerr as an untainted parameter. Adding an untainted annotation to the variable declaration directed Splint to ensure that globerr is never tainted at an interface point. The other warnings concerned possibly tainted values passed to lreply in
Table 1. False warnings checking wu-ftpd

Cause                     Number   Percent
External assumptions         6       7.9
Arithmetic limitations      13      17.1
Alias analysis               3       3.9
Flow control                20      26.3
Loop heuristics             10      13.2
Other                       24      31.6
site_exec. Because these values were obtained from a remote user, they constituted a serious vulnerability (CVE-2000-0573). The second message Splint produced in the first execution reported a similar invalid transfer in setproctitle. After adding an annotation and rerunning Splint, we found two additional format string bugs in wuftpd. These vulnerabilities, described in CERT CA-2000-13, are easily fixed using the %s constant format string. We also ran Splint on wu-ftpd 2.6.1, a version that fixed the known format bugs. After adding eight untainted annotations, Splint ran without reporting any format bug vulnerabilities.
Buffer overflow vulnerabilities Running Splint on wu-ftpd 2.5 without adding annotations produced 166 warnings for potential out-of-bounds writes. After adding 66 annotations in an iterative process such as the one we described above for checking taintedness, Splint produced 101 warnings. Twenty-five of these warnings indicated real problems and 76 were false (summarized in Table 1). Six of the false warnings resulted because Splint was unaware of assumptions external to the wu-ftpd code. For example, wu-ftpd allocates an array based on the system constant OPEN_MAX, which specifies the maximum number of files a process can have open. The program then writes to this buffer using the integer value of an open file stream’s file descriptor as the index. This is safe because the file descriptor’s value is always less than OPEN_MAX. Without a more detailed specification of the meaning of file descriptor values, there is no way for a static analysis tool to determine that the memory access is safe. Ten false warnings resulted from loops that were correct but did not match the loop heuristics. To some extent, we could address this by incorporating additional loop January/February 2002
heuristics into Splint, but there will always be some unmatched loops. The remaining 60 spurious messages resulted from limitations in Splint’s ability to reason about arithmetic, control flow, and aliases. We’re optimistic that implementing known techniques into Splint will overcome many of these limitations without unacceptable sacrifices in efficiency and usability. It is impossible, though, to eliminate all spurious messages because of the general undecidability of static analysis.
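To make the external-assumption case concrete, the OPEN_MAX idiom described earlier corresponds roughly to the following sketch (ours, not wu-ftpd's actual code):

#include <stdio.h>
#include <limits.h>

#ifndef OPEN_MAX
#define OPEN_MAX 256   /* fallback for illustration; many POSIX systems
                          define OPEN_MAX in <limits.h> */
#endif

static int transfer_logged[OPEN_MAX];

void mark_logged (FILE *f)
{
   /* Safe only because the descriptor returned by fileno is always less
      than OPEN_MAX, a system guarantee that is invisible to a static
      analysis of this code alone, so a warning is reported here. */
   transfer_logged[fileno (f)] = 1;
}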
Lightweight static analysis is a promising technique for detecting likely software vulnerabilities, helping programmers fix them before software is deployed rather than patch them after attackers exploit the problem. Although static analysis is an important approach to security, it is not a panacea. It does not replace runtime access controls, systematic testing, and careful security assessments. Splint can only find problems that are revealed through inconsistencies between the code, language conventions, and assumptions documented in annotations. Occasionally, such inconsistencies reveal serious design flaws, but Splint offers no general mechanisms for detecting high-level design flaws that could lead to security vulnerabilities.

The effort involved in annotating programs is significant, however, and limits how widely these techniques will be used in the near future. Providing an annotated standard library solves part of the problem. However, it does not remove the need to add annotations to user functions where correctness depends on documenting assumptions that cross interface boundaries. Much of the work in annotating legacy programs is fairly tedious and mechanical, and we are currently working on techniques for automating this process. Techniques for combining runtime information with static analysis to automatically guess annotations also show promise.18

No tool will eliminate all security risks, but lightweight static analysis should be part of the development process for security-sensitive applications. We hope that the
security community will develop a tool suite that codifies knowledge about security vulnerabilities in a way that makes it accessible to all programmers. Newly discovered security vulnerabilities should not lead just to a patch for a specific program’s problem, but also to checking rules that detect similar problems in other programs and prevent the same mistake in future programs. Lightweight static checking will play an important part in codifying security knowledge and moving from today’s penetrate-and-patch model to a penetratepatch-and-prevent model where, once understood, a security vulnerability can be codified into tools that detect it automatically.
Acknowledgments David Evans’ work is supported by an NSF Career Award and a NASA Langley Research Grant. David Larochelle’s work is supported by a Usenix Student Research Grant.
References 1. Common Vulnerabilities and Exposures, version 20010918, The Mitre Corporation, 2001; http://cve.mitre.org (current Nov. 2001). 2. D. Wagner et al., “A First Step Towards Automated Detection of Buffer Overrun Vulnerabilities,” Proc. 2000 Network and Distributed System Security Symp., Internet Society, Reston, Va., 2000; www.isoc.org/ndss2000/ proceedings (current Nov. 2001). 3. I. Goldberg et al., “A Secure Environment for Untrusted Helper Applications: Confining the Wily Hacker,” Proc. Sixth Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 1996; www.cs.berkeley.edu/~daw/papers/ janus-usenix96.ps (current Nov. 2001). 4. D. Evans and A. Twyman, “Flexible Policy-Directed Code Safety,” IEEE Symp. Security and Privacy, IEEE CS Press, Los Alamitos, Calif., 1999, pp. 32–45. 5. A. Baratloo, N. Singh, and T. Tsai, “Transparent RunTime Defense Against Stack-Smashing Attacks,” Proc. Ninth Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 2000; www.usenix.org/events/usenix2000/ general/baratloo.html (current Nov. 2001). 6. C. Cowan et al., “StackGuard: Automatic Adaptive Detection and Prevention of Buffer Overflow Attacks,” Proc. Seventh Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 1998; http://immunix.org/ StackGuard/usenixsc98.pdf (current Nov. 2001). 7. D. Evans, “Static Detection of Dynamic Memory Errors,” SIGPLAN Conf. Programming Language Design and Implementation, ACM Press, New York, 1996, pp. 44–53. 8. G. Ramalingam, “The Undecidability of Aliasing,” ACM Trans. Programming Languages and Systems, vol. 16, no. 5, 1994, pp. 1467–1471. 9. J. Viega et al., “ITS4 : A Static Vulnerability Scanner for C and C++ Code,” Proc. Ann. Computer Security Applications Conf., IEEE CS Press, Los Alamitos, Calif., 2000; www.acsac.org/2000/abstracts/78.html (current Nov. 2001). 10. D. Larochelle and D. Evans, “Statically Detecting
Likely Buffer Overflow Vulnerabilities," Proc. 10th Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 2001; www.usenix.org/events/sec01/larochelle.html (current Nov. 2001).
11. C. Barker, "Static Error Checking of C Applications Ported from UNIX to WIN32 Systems Using LCLint," senior thesis, Dept. Computer Science, University of Virginia, Charlottesville, 2001.
13.
14.
15.
16.
17.
Likely Buffer Overflow Vulnerabilities,” Proc. 10th Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 2001; www.usenix.org/events/sec01/larochelle.html (current Nov. 2001). C. Barker, “Static Error Checking of C Applications Ported from UNIX to WIN32 Systems Using LCLint,” senior thesis, Dept. Computer Science, University of Virginia, Charlottesville, 2001. C. Cowan et al., “FormatGuard: Automatic Protection From printf Format String Vulnerabilities,” Proc. 10th Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 2001; www.usenix.org/events/sec01/cowanbarringer. html (current Nov. 2001). L. Wall, T. Christiansen, and J. Orwant, Programming Perl, 3rd edition, O’Reilly & Associates, Sebastopol, Calif., 2000. U. Shankar et al., “Detecting Format String Vulnerabilities with Type Qualifiers,” Proc. 10th Usenix Security Symp., Usenix Assoc., Berkeley, Calif., 2001; www. usenix.org/events/sec01/shankar.html (current Nov. 2001). D. Evans et al., “LCLint: A Tool for Using Specifications to Check Code,” SIGSOFT Symp. Foundations of Software Eng., ACM Press, New York, 1994; www.cs. virginia.edu/~evans/sigsoft94.html (current Nov. 2001). D. Santo Orcero, “The Code Analyzer LCLint,” Linux Journal, May 2000; www.linuxjournal.com/article. php?sid=3599 (current Nov. 2001). C.E. Pramode and C.E. Gopakumar, “Static Checking of C programs with LCLint,” Linux Gazette, Mar. 2000; www.linuxgazette.com/issue51/pramode.html (current Nov. 2001).
18. M.D. Ernst et al., "Dynamically Discovering Likely Program Invariants to Support Program Evolution," Proc. Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 1999, pp. 213–224.
For more information on this or any other computing topic, please visit our Digital Library at http://computer.org/publications/dlib.
About the Authors

David Evans is an assistant professor in University of Virginia's Department of Computer Science. His research interests include annotation-assisted static checking and programming swarms of computing devices. He received BS, MS, and PhD degrees in computer science from the Massachusetts Institute of Technology. Contact him at the Department of Computer Science, School of Engineering & Applied Science, University of Virginia, 151 Engineer's Way, P.O. Box 400740, Charlottesville, VA 22904-4740; [email protected] or www.cs.virginia.edu/evans.

David Larochelle is a PhD student in University of Virginia's Department of Computer Science, where he works on lightweight static analysis with a focus on security. He has a BS in computer science from the College of William and Mary in Williamsburg, Virginia, and an MCS in computer science from the University of Virginia. Contact him at the Department of Computer Science, School of Engineering & Applied Science, University of Virginia, 151 Engineer's Way, P.O. Box 400740, Charlottesville, VA 22904-4740; [email protected] or www.ds.virginia.edu/larochelle.
What do software engineering professionals need to know, according to those who hire and manage them? They must be able to produce secure and high-quality systems in a timely, predictable, and cost-effective manner. This special issue will focus on the methods and techniques for enhancing software education programs worldwide—academic, re-education, alternative—to give graduates the knowledge and skills they need for an industrial software career.
For more information about the focus, contact the guest editors; for author guidelines and submission details, contact the magazine assistant at [email protected] or go to http://computer.org/software/author.htm. Submissions are due at [email protected] on or before 1 April 2002. If you would like advance editorial comment on a proposed topic, send the guest editors an extended abstract by 1 February; they will return comments by 15 February.
Potential topics include

• balancing theory, technology, and practice
• experience reports with professional education
• software processes in the curriculum
• teaching software engineering practices (project management, requirements, design, construction, …)
• quality and security practices
• team building
• software engineering in beginning courses
• computer science education vs. SE education
• undergraduate vs. graduate SE education
• nontraditional education (distance education, asynchronous learning, laboratory teaching, …)
• innovative SE courses or curricula
• training for the workplace
Manuscripts must not exceed 5,400 words including figures and tables, which count for 200 words each. Submissions in excess of these limits may be rejected without refereeing. The articles we deem within the theme's scope will be peer-reviewed and are subject to editing for magazine style, clarity, organization, and space. We reserve the right to edit the title of all submissions. Be sure to include the name of the theme for which you are submitting an article. Guest Editors:
Watts S. Humphrey, Software Engineering Institute, Carnegie Mellon University; [email protected]
Thomas B. Hilburn, Dept. of Computing and Mathematics, Embry-Riddle Aeronautical University; [email protected]
feature metrics
Collecting and Analyzing Web-Based Project Metrics Rob Pooley, Heriot-Watt University Dave Senior and Duncan Christie, Agilent
Any product schedule or quality problem can have a major impact on business success, including missed release dates and defects being discovered at customer sites. Implementing a system to apply metrics—values that concisely capture useful information—can help prevent, identify, and resolve such problems. Agilent's Telecom Systems Division's R&D department develops large software systems comprising many product elements. While the TSD had
Metrics generated from real project data provide a valuable tool for evaluating and guiding performance. The authors describe their experience developing a Web-based project metrics system and encouraging its adoption.
software configuration management tools in place, metrics were not part of the process. Its SCM process uses two related tools: ClearCase, for version control and build management, and the Distributed Defect Tracking System, for change management.1 As a byproduct of their role in SCM, these tools capture a wealth of data about the division’s software development process, but the TSD uses only a small part of this data for tracking and managing defects and enhancements for each software release. Thus we had an interesting situation, in which measurement was already in place without metrics being generated. This enabled us to investigate the impact and success of introducing a metrics program in an industrial environment. In partnership with the Department of Computing and Electrical Engineering at Heriot-Watt University, the TSD developed a Web-based pilot system (see Figure 1) to
generate useful metrics from the SCM data. All the examples we describe come from real projects (thanks to the kind permission of Agilent).

Figure 1. The context for implementing a software metrics system at Agilent. (The figure shows ClearCase, for version control, and the Distributed Defect Tracking System, for change management, feeding the software metrics system, which project managers, engineers, support, and verification staff access through workstation and Web interfaces.)

Adopting a metrics attitude
As software becomes a bigger and more important part of our lives, our expectations and demands also increase. As a result, software quality assurance is more an issue now than ever before. Our fundamental aim in measuring software and the software process is to increase understanding and thus increase quality. The key reasons for measuring software are to2

■ characterize software engineering processes and products,
■ evaluate project status with respect to plans,
■ predict so that projects can be planned, and
■ guide improvements of software engineering processes.

After you know what your organization does and how it operates, you can try to plan, evaluate, and make improvements. Measurement is the only mechanism available for quantifying and identifying the characteristics needed to achieve this. Software measurement is also the basis for creating and maintaining metrics. Metrics indicate process or product qualities that aren't directly measurable and, if not addressed by metrics, are subjective and intangible. A software metric can define a standard way of measuring some attribute of the software development process, such as size, cost, defects, communications, difficulty, or environment. Metrics can range from the primitive (directly measurable or countable, such as total number of outstanding defects) to complex and derived (such as number of noncommented source statements per engineer per month). You can measure a huge variety of attributes for any single project. The attributes measured depend on the described entities. For example, what you measure when evaluating a development process will differ from what you measure when evaluating system or module progress. Before measurement begins, what you will derive from the data must be clear; then, you can successfully define and measure the appropriate attributes. Typically, the attributes measured fall into the following core categories:3

■ cost (staff effort, phase effort, total effort);
■ errors (date found or corrected, effort required, error source and class);
■ process characteristics (development language, process model, technology);
■ project dynamics (changes or growth in requirements or code);
■ project characteristics (development dates, project size, total effort); and
■ software structure (size, complexity).

Measuring problems and defects helps us control software product quality, cost, and schedules and thus product development and maintenance. It also helps us determine the status of corrective actions, measure and
improve software development processes, and (to the extent possible) predict remaining defects or failure rates.2 Even by measuring one specific area, we gain immediate insight into various other attributes associated with the software project. Management attitudes The biggest challenge in using metrics can be convincing managers of their value.4,5 When selling software metrics, emphasize how they can help manage processes. Case studies and examples easily illustrate this, but it is harder to maintain interest over a long period. Results and benefits are not immediate, and it can take years to fully implement a successful software metrics program. The main effort, deciding what to measure and how to measure it, requires financial commitment as well as manpower. Software metrics can help project managers more accurately estimate project milestones, especially when historical data is available. They are also a useful mechanism for monitoring progress; plotting weekly or monthly changes reveals trends that can predict problem areas early enough to take action. Engineer attitudes Software process measurement is a critical element for success, but it won’t work unless people agree to use it. Engineers don’t like metrics applied to them; carefully consider their perceptions and needs. The best way to deal with this is to ensure that you won’t use the data against them. January//February 2002
The problem is that the data needed for simple metrics and monitoring can generate more personal metrics. The temptation is to identify a project's weak points in terms of individual engineers and to act to maximize productivity. However, metrics are not consistent or understood enough for this purpose. If metrics cannot fully explain a software project, how can we use them to evaluate people? The misuse of metrics data can lead to disastrous consequences, especially with engineers. However, this doesn't mean metrics should never be used with engineer data; in some cases, this can be very useful. The key is never to use metrics in a threatening way. Also, consider a person's reaction to change, especially when introducing widespread measurement. It is always difficult to get people to accept change, even in the smallest way. Although this might seem a trivial problem, it is enough to cause a software metrics project to fail.4,5

Figure 2. Cumulative defects generated from data in the DDTS database. (The graph plots the number of ClearTrack requests submitted, closed, and outstanding each month during 1998.)

Figure 3. The root cause of cumulative defects. (The bar chart shows the number of ClearTrack requests for each root cause: code, design, specifications/requirements, environmental/support, documentation, and other.)

System definition and development
Our primary motivations for developing a Web-based system were to facilitate multiuser access and make the system platform-independent. We built the system in Perl and KShell using HTML, JavaScript, and CGI bin scripts.6,7 We applied a simple process-oriented design to a rapid application development model, so that we could swiftly prototype and revise requirements. After several meetings with the TSD's project managers, technical leads, and developers, we defined a variety of key metrics. For the final system, we designed and implemented three key ones:
ClearTrack requests over time. The Distributed Defect Tracking System database stores all the defects and enhancements raised for all TSD projects. These database entries, called ClearTrack requests, reflect the name of the system before DDTS. To indicate progress, we plot on a graph the number of raised or closed CTRs against time (see Figure 2). This metric lets us extract data at various granularity levels (for example, at the project or product level) and compare previous versions of the same project or product. Detailed measurement makes this metric flexible. Root cause analysis. After fixing a defect, the engineer records the root cause, which indicates the state of the life cycle where the defect originated. Each root cause entry contains three fields: the life-cycle stage in which the defect originated, the type of fault, and the specific cause of the error. Examples include {Code, Logic, Wrong} and {Design, Error Checking, Missing}. By calculating and plotting the total number
and type of defects in each phase for a particular project or version, the project manager can identify the main origins of defects (see Figure 3). By examining the root causes of a project’s defects, we can also identify the consistency and correctness of the metric. The first and second fields provide relatively good data while the third field, specific cause, is less reliable. Time to fix/CTR age. This metric calculates the maximum, minimum, and average time it takes to fix CTRs for a specific project or product. Each CTR has an associated severity—that is, how much adverse impact a defect has or how important the new functionality is. By examining the severity of a project’s CTRs, we can identify trends and discrepancies. For example, if we expect a severity-1 CTR to close more quickly than a severity-4 one and the opposite occurs, the severity-4 CTR might be less severe than originally thought and need reclassification. We can also analyze CTR age (how old unresolved CTRs are) on a project or product level.
Evaluation Although we can measure usability in many ways, in this case the system’s usefulness for project managers and engineers to monitor project status determines its usability. For example, can a manager use the system to generate meaningful data that will aid project management? Because different managers use different mechanisms to monitor and manage projects, we decided to evaluate four managers, each from a different project, and their use of all three metrics to get the greatest diversity of feedback possible. During the evaluation, we asked the managers if they currently used metrics. Most used customized SQL queries to the DDTS database to retrieve data and manually generate reports. The reports ranged from simple counts (project defects open and closed) to specific reports (defects verified last week, by submitter). Using custom queries has advantages and disadvantages; for example, the project manager has the flexibility to generate highly specific metrics, but it is easy to use the wrong queries or make mistakes. This is especially true
when comparing reports run by other project managers; often, they look similar but were produced using different queries, so the results differ substantially. In some extreme cases, no metrics were used at all. Providing a standard metrics toolset enables comparison without the ambiguity present in ad hoc reports. The project managers’ responses indicated that they wanted a standard toolset as well. The graphs produced by the metrics system showed some basic areas that required attention, highlighting the need for even the simplest monitoring. Further investigation indicated infrequent use of complex metrics, where trending could be applied. Users of the new system can quickly investigate the use of trending metrics and complex reporting on a large scale. The requirements evaluations also revealed that the project managers had positive attitudes toward metrics as a tool, but there was a definite need to standardize the process. As expected, all four project managers were interested in all three metrics. The CTRs-over-time metric was of most interest, because it uses data they all had already used for project monitoring (although only one used the data in a graphical form). This meant they could immediately relate to this metric and quickly verify the results. Our system generates this metric as a graph showing the trend over time, making analysis easier. One project ran this metric monthly, so that the manager could make direct comparisons. There was a similar response to the root cause analysis metric, because two of the managers were already using it in reports. The time-to-fix/CTR age metric was rarely used and thus was less interesting. Once we demonstrated it, however, the managers thought this metric would also be useful.
CTRs over time The project managers raised the following points about CTRs over time: 1. On its own, the graph isn’t a useful indicator of progress. The CTRs-over-time metric provides a simple count of defects raised or closed on a certain date, but it doesn’t reflect at which point during the project life cycle they occur. On the other hand, the managers said that the graph produced by this metric coupled with a January/February 2002
Figure 4. Plot of CTRs over time. Note the sharp climb in closures after the project manager corrected an implementation problem. (The graph plots submitted, closed, and outstanding CTRs by month from April 1998 to April 1999.)
Manhattan score graph, where each defect’s score depends on its current state and severity, would provide this information. This emphasized that trending is more important than absolute numbers. 2. The DDTS visibility field is ambiguous. DDTS requires a defect to be classified as either internal or external and for one of these values to be set in its visibility field. Most of the graphs that DDTS produces focus on external defects, because they are the main indicators for project monitoring. However, what classifies a CTR as being external or internal? An external CTR is customer-facing—for example, adding new functionality or fixing a customer-reported defect. An internal CTR is invisible to the customer—for example, restructuring or implementing new algorithms. Users of DDTS are often genuinely confused about which of these values should be used in the visibility field, but misuse exists as well. In some projects, an independent person verifies all external defects. In these cases, visibility becomes “Does it need to be independently verified?” rather than “Can the customer see it?” This interpretation is clearly incorrect. 3. Severity should replace priority as a tool option. The CTRs-over-time metric provides the option to select CTR priority, which is for internal TSD use and determines the order in which to address CTRs. However, project managers do 56
not use it widely. Severity, which managers frequently use in their reports, is customer oriented, detailing how much the defect affects the system. We cannot release products that have critical defects. By adding severity to this metric, we can show breakdown by severity—a cross section of the data that we cannot see otherwise. 4. Do not report data at the engineer level. Some managers run reports showing data broken down at an engineer level. While this could be useful, we deliberately omitted it because individual metrics can be highly sensitive. (The same point applies to the other metrics.) Figure 4 illustrates one use of our metrics system—to identify implementation discrepancies and problems. When evaluating a particular project, we found it strange that the engineering team had closed only a few defects. It turned out that the DDTS system wasn’t being used correctly. The figure shows a sharp climb in closures after the project manager learned about and fixed the problem. Does CTR age really indicate time to fix? The project managers raised only one major problem with the time-to-fix metric: When does a CTR become closed? Each CTR follows a set path: It is raised, then assigned to an engineer who implements and tests a solution. However, a CTR isn’t officially closed until it is independently verified, so there is a lag between submission and official closure. DDTS’s date_closed field, which is undefined until a CTR is closed, only receives a date entry once the CTR is verified. This causes some of the values generated by this metric to be incorrect, because of the process that some projects follow. Generating the correct data is possible, because DDTS stores all dates regarding state transition. In this case, it would be more useful to query the tested_on date. This would also let us examine single projects and verification rates. This obviously has an effect on the CTRs-over-time metric but to a lesser extent, because that metric is more concerned with identifying trends. Tables 1 and 2 illustrate the data distortion that occurs when the verification lag is not accounted for. We generated the tables with the
metric system using both date_closed and tested_on. The time-to-fix data clearly shows that the average project fix time is halved. The defect age table shows typical data for a project where there should be very few critical defects open. In this case, the lag is not a problem, but the use of date_closed still produces misleading data; that is, it found a very old, unverified critical defect. This does not mean that date_closed should not be used in some cases. It catches defects that aren't verified but should be.

Table 1. CTR time to fix according to date_closed and tested_on metrics

Severity   Field         Max    Min    Average   Count
Critical   date_closed    64     39     54.33        3
           tested_on       8      3      6.25        4
High       date_closed   196     11     88.60       10
           tested_on      59      1     23.18       11
Medium     date_closed   398      0    116.62       16
           tested_on     136      0     27.06       17
Low        date_closed   472      0     69.37       27
           tested_on     472      0     63.04       25
Project    date_closed   472      0     85.50       56
           tested_on     472      0     40.63       57

Table 2. CTR age according to date_closed and tested_on metrics

Severity   Field         Max    Min    Average   Count
Critical   date_closed   176    176    176           1
           tested_on       —      —      —           0
High       date_closed   258    258    258           1
           tested_on       —      —      —           0
Medium     date_closed   342     85    224.50        4
           tested_on     563     85    330           3
Low        date_closed   484    126    257.38        8
           tested_on     560    127    510.70       10
Project    date_closed   484     85    242.21       56
           tested_on     563     85    469          13

Root cause analysis
This metric is conceptually much simpler than the other two, so the users found less ambiguity and fewer problems. The main difficulty with it is in the data it uses and how valid it is. Root cause has only recently become usable from a metrics point of view. DDTS provides a more structured approach to root cause analysis than was possible before, but it requires careful use to ensure that the data entered is correct. As mentioned earlier, a root cause entry consists of three fields. The first two appear reliable enough for metrics, particularly to compare different releases for improved use. Incidentally, this metric is a good indicator of the user's progress in entering DDTS data correctly. Figure 5 illustrates one use of the metrics system identified during the evaluation. Release A showed a high bias toward the root cause labeled "other." This isn't helpful because it tells nothing about defects. Either an appropriate category doesn't exist or the user
Figure 5. A root cause example for two releases. The amorphous category of "other" for the older release is not helpful because it is so heavily populated; the distribution of root causes is more balanced in the newer release. (The bar chart compares, for releases A and B, the number of ClearTrack requests in each root cause category: other, code, design, specifications/requirements, environmental/support, and documentation.)
failed to perform the analysis correctly. Release B showed a more balanced selection of root causes, with the majority lying in code and design. Release A is older than release B and demonstrates the success of the root cause user training given between these releases. January/February 2002
Observations
As we developed the metrics system, identifying each metric’s core requirements—what data to use and how to present it—was easy, because the metrics were simple and the DDTS system contained lots of data. During prototype testing and final evaluations, we also quickly found each metric’s limitations. This was a byproduct of our metrics system being the first in the TSD as well as the ambiguity caused by the variety of data that DDTS captured. Despite its limitations, the system acted as a catalyst to two important activities. First, it allowed us to revise and enhance the initial requirements while also helping to identify different flavors of the metric that were not initially considered (such as time to fix and defect age). Second, it identified ambiguities, misuse, software process flaws, and data discrepancies, as illustrated in the CTRs-raised-over-time evaluation. The overall process highlighted the iterative nature of developing a metrics system. In implementing a metrics program when software measurement was already in place, we identified an important issue: Because of the wealth of data available, identifying which data will provide the metric desired can be difficult. The converse can equally be true, where measurement is so sparse that no meaningful metrics can be derived.
About the Authors Rob Pooley is a professor of computer science and head of the Department of Computing
and Electrical Engineering at Heriot-Watt University, Edinburgh, Scotland. His interests include software engineering, performance of computer-based systems, and information systems engineering. He received a BSc in economics from the University of Bristol, an MSc in computer science from the University of Bradford, and a PhD in computer science from the University of Edinburgh. He is a member of the British Computer Society. Contact him at Heriot-Watt University, Riccarton, Edinburgh, Scotland; [email protected].
David Senior is a senior development engineer in the Product Generation Unit of Agi-
lent’s Telecom Systems Division. He received a BEng in information systems engineering from Heriot-Watt University in Edinburgh and has since specialized in telecom network configuration for the Agilent acceSS7 product. He is a member of the IEE. Contact him at Agilent Telecomm System Division, South Queensferry, Edinburgh, Scotland; [email protected].
A user-centered, Web-based metrics system can succeed in an industrial environment where software measurement exists. However, you must convince users of the validity, power, and usefulness of metrics to ensure they use and benefit from the system. We are now working to develop our prototype into a robust management tool and expand it to include other metrics (for example, code measurements, including size and complexity). Measuring code size not only lets us monitor productivity but also lets us develop metrics such as defect density. The TSD and Heriot-Watt are also exploring new avenues for developing the use of metrics tools.
Acknowledgments We thank Agilent’s Telecom Systems Group staff and management for supporting this project. The success of this project reflects their guidance and support. Particular thanks go to Alan Abernethy and Alex Tomlinson for crucial assistance, and to Morag MacDonald for making this academic–industrial project possible and agreeing to release the real project data used in all the examples. Agilent Technologies is a separate company formed at the end of 1999 as part of a strategic realignment of the Hewlett-Packard business.
References 1. ClearDDTS User’s Guide, Rational Software Corp., Cupertino, Calif., 1997. 2. R.E. Park et al., Goal-Driven Software Measurement, Software Eng. Inst., Pittsburgh, 1996. 3. M.J. Bassman, F. McGarry, and R. Pajerski, Software Measurement Guidebook, NASA Software Eng. Laboratory Series, NASA Information Center, Washington, D.C., 1995. 4. R.B. Grady and B. Robert, Practical Software Metrics for Project Management and Process Improvement, Hewlett-Packard Professional Books, Prentice Hall, Upper Saddle River, N.J., 1992. 5. R.B. Grady and D.L. Caswell, Software Metrics: Establishing A Company-Wide Program, Hewlett-Packard Professional Books, Prentice Hall, Upper Saddle River, N.J., 1987. 6. M. Glover, A. Humphreys, and E. Weiss, Perl 5: HowTo, Waite Group Press, Corte Madera, Calif., 1996. 7. M. McMillan, Perl from the Ground Up, Osbourne McGraw-Hill, New York, 1998.
Duncan Christie is a process improvement manager in the Product Generation Unit,
Agilent Telecom Systems Division. He received a BEng from Strathclyde University in Glasgow and is a member of the IEE. His experience spans software development, process, and quality management, including roles in the defense, financial, and telecommunications sectors. Contact him at Agilent, Telecomm System Division, South Queensferry, Edinburgh, Scotland; [email protected].
For more information on this or any other computing topic, please visit our Digital Library at http://computer.org/publications/dlib.
focus
software patterns
Is This a Pattern? Tiffany Winn and Paul Calder, Flinders University of South Australia
Within a given domain, what might appear to be very different problems are often the same basic problem occurring in different contexts. A software design pattern identifies a recurring problem and a solution, describing them in a particular context to help developers understand how to create an appropriate solution. Patterns thus aim to capture and explicitly state abstract problem-solving knowledge that is usually implicit and gained only through experience.
"Pattern" is an often misused buzzword, perhaps because patterns do not lend themselves to prescriptive, formal definitions. To help software designers understand, use, and write better patterns, the authors propose a set of essential characteristics that can serve as a test for "pattern-ness."
Developers can use that knowledge to solve what appears to be a new problem with a tried-and-true solution, thus improving the design of new software. Recently, the word “pattern” has become a buzzword, and the implicit definition of the pattern concept has become less precise. Defining patterns is tricky, because they are not bound by prescriptive formal definitions. Rather, it is consensus about the existence of particular patterns in a range of existing software that validates them. Yet we still need to develop our understanding of patterns at a theoretical as well as practical level if we are to identify them, use them well, and distinguish them from similarseeming nonpatterns that are described in a pattern-like style.1 We must address this lack of clarity if the pattern concept is to retain its force. For example, many authors agree that
Mediator (an object acting as a go-between for communication between other objects) is a pattern.1,2 But what about algorithms such as Bubblesort and programming techniques such as Extend by Subclassing? On a larger scale, could idiomatic styles of system organization such as Pipe and Filter (pipes connect filters; filters read data from input streams, transform it, and produce data on output streams)3 represent patterns? And what about activities other than program or system design? For example, HotDraw patterns tell HotDraw framework users how to assemble HotDraw components to construct a drawing editor4—but are they really patterns? We propose a list of essential characteristics of patterns. Such a list cannot provide a definitive test for pattern-ness: given a pattern-like entity that exhibits the essential characteristics, we cannot say that it is definitely a pattern.
However, we suggest that any entity that does not exhibit any one or more of the essential characteristics is not a pattern. What, then, are the essential characteristics of patterns? We have identified nine, each of which is underpinned by the premise that patterns are generative. Architect Christopher Alexander explains:

Once we understand buildings in terms of their patterns, we have a way of looking at them which makes all buildings, all parts of a town similar. … We have a way of understanding the generative processes which give rise to these patterns.5

[A pattern] is both a process and a thing; both a description of a thing which is alive, and a description of the process which will generate that thing.5
In other words, a pattern does more than just showcase a good system’s characteristics; it teaches us how to build such systems. 1. A pattern implies an artifact Understanding a pattern means having some sort of picture of the “shape” of the potential artifacts being described. For a piece of software, understanding its shape means understanding ■ ■
at the big-picture level, how the software works; and at the design level, the relationships that the software attempts to capture. James Coplien put it this way: I could tell you how to make a dress by specifying the route of a scissors through a piece of cloth in terms of angles and lengths of cut. Or, I could give you a pattern. Reading the specification, you would have no idea what was being built or if you had built the right thing when you were finished. The pattern foreshadows the product: it is the rule for making the thing, but it is also, in many respects, the thing itself.6
Explaining how to make a dress by specifying a scissors’ route through a piece of cloth is like telling programmers how to write a program by handing over a piece of assembly code. The assembly code might solve the problem, but is unlikely to give them any idea of what they are building. 60
Nor does it give them a means to evaluate their solution’s correctness or usefulness. All they can do is rote-copy the given assembly code or work in a higher-level language and compare the assembly code produced with that suggested. Using a dress pattern to make a dress is like using a design pattern to write a piece of software. The design pattern does not just show how to create the code at a lineby-line level. It also captures the program’s key overall structure at a higher level, in a more physical or spatial sense. Coplien’s dress pattern example, flow charts and other graphical representations of standard algorithms, and the structural diagrams provided in pattern catalogs7 all highlight the important role of pictures in providing big-picture understanding. Algorithms are often best explained with a combination of text, sample code, and pictures. In the case of Bubblesort, for example, a picture can highlight “lighter” elements “bubbling up” and “heavier” ones “sinking down” as the sort operates. Having gained such big-picture understanding, programmers can better adapt sample code to their specific needs, instead of needing to literally copy or translate the given sample code to use it. In this respect, software patterns are the same as Alexander’s architectural patterns. If a proposed software pattern cannot be drawn, it does not embody a physical understanding of a software artifact’s structure and therefore is not a pattern. 2. A pattern bridges many levels of abstraction A pattern is neither just a concrete, designed artifact nor just an abstract description. Rather, it incorporates design information at many abstraction levels, from sample code to big-picture structure diagrams. A pattern facilitates the progression from a vague idea of “I need some software to do this kind of task” to the actual software itself. It also facilitates standing back from a piece of software and analyzing it at more general levels of design. So, a pattern bridges different abstraction levels and thinking about a problem and its solution. Robert Floyd illustrates what it means to bridge, or link, different abstraction levels in the context of teaching programming:
If I ask another professor what he teaches in the introductory programming course, whether he answers proudly “Pascal” or diffidently “FORTRAN,” I know that he is teaching a grammar, a set of semantic rules, and some finished algorithms, leaving the students to discover, on their own, some process of design. Even the texts based on the structured programming paradigm, while giving direction at the highest level, what we might call the “story” level of program design, often provide no help at intermediate levels, at what we might call the “paragraph” level.8
Linking different abstraction levels means helping designers make connections between different design levels, such as the story, paragraph, sentence, and word levels. For a particular problem, you could include a general overview at the story level, a flowchart detailing control flow at the paragraph level, algorithms at the sentence level, and sample code at the word level. The flowchart and algorithm, for example, work together to help link idea with implementation, and general overview with sample code. In Floyd’s case, he teaches what he calls a standard paradigm for interactive input—prompt-read-checkecho—together with relevant algorithm and sample code, rather than simply providing sample code and leaving students to work out the general paradigm themselves. Design aids such as patterns should bridge different design levels because a designer’s understanding of a problem evolves as the solution develops: The most common information needs in the early stages of development are ill-defined— users don’t know how to solve a problem or where to look for a solution. … As the design unfolds, the designer’s understanding of the problem and potential solutions improves, and he refines and elaborates the problem definition until a satisfactory design emerges.9 [The designer’s] information needs change as the problem context evolves.9
It is important, therefore, to develop design aids that help people move gradually from an initial, general understanding of a problem to a more in-depth one. Further, that ability to link different levels of thinking about a design is critical to knowledge reuse, and knowledge reuse is a key to good design. The challenge in software reuse is not so
much to do more of it, but to recognize which reuse is worth doing: “The challenge in reusability is to express the fragmentary and abstract components out of which complete programs are built.”10 All designers, whether consciously or unconsciously, reuse knowledge by learning from their own and others’ experience. Design patterns facilitate knowledge reuse by capturing implicit and abstract knowledge in a form that lets a range of people share and use it. But designers also need to link abstraction levels to reuse knowledge. They need to recognize and abstract from useful similarities, at possibly any abstraction level, between their own context and another’s. Floyd said it like this:
The challenge in software reuse is not so much to do more of it, but to recognize which reuse is worth doing.
I believe it is possible to explicitly teach a set of systematic methods for all levels of program design, and that students so trained have a large head start over those conventionally taught entirely by the study of finished programs.8 [You should] identify the paradigms [patterns] you use, as fully as you can, then teach them explicitly. They will serve your students when Fortran has replaced Latin and Sanskrit as the archetypal dead language.8
Design patterns do not just present finished programs. They also identify the paradigm or pattern underlying that program. This facilitates software designers’ recognition of the pattern and hence their ability to abstract from their own and other contexts and reuse knowledge. 3. A pattern is both functional and nonfunctional Functional issues deal with possibility; they determine what decisions could be (or were) made in a particular context. Nonfunctional issues deal with feasibility; they address a particular decision’s desirability in a particular context, or the reasons the decision was made. Nonfunctional issues are critical to good design because they help designers balance conflicting forces, and they facilitate design adaptation in the face of change. Design often involves balancing related and conflicting forces. For example, you might sacrifice software readability for efficiency. So, good design requires more than understanding just the forces involved. It also requires understanding the relationships January/February 2002
Figure 1. (a) The Mediator pattern; (b) a mediator and two colleagues (circled) in the Prism system.
between those forces. If those relationships are not explicitly documented, that understanding can be lost, with significant consequences. For example, Douglas Schmidt says that the loss of understanding about conflicting forces “deprives maintenance teams of critical design information, and makes it difficult to motivate strategic design choices to other groups within a project or organization.”11 In contrast, explicitly documenting the rationale for design decisions opens that rationale to criticism, thus facilitating improvement in the design process. Software design is complex and subject to frequent change. Where design is complex, a designer will unlikely be able to grasp the interplay between forces involved simply by observation or intuition. Designers must therefore develop skills, methods, and tools that help clarify that interplay. In a climate where change is frequent and inevitable, an explicit understanding of the reasoning behind design decisions facilitates understanding of the consequences of change.12 This understanding is critical for effective software maintenance and adaptation. Design patterns address both functional and nonfunctional design issues. A pattern is inherently functional because it documents a solution to a problem. It is also nonfunctional because it discusses the feasibility of the solution it documents. Patterns address functional design issues by providing fragments of sample code and diagrams of software structure, and by discussing implementation issues. They highlight nonfunctional issues in many ways. For example, a pattern includes discussion of its applicability. For instance, Schmidt’s Reactor pattern “explains precisely when to apply the pattern (e.g., when each event can be processed quickly)
and when to avoid it (e.g., when transferring large amounts of bulk data).”11 Although both functional and nonfunctional design issues are important, the interweaving of the two in discussion is what makes patterns so effective. The functional aspects provide a good solution in a given context; the nonfunctional aspects let us more effectively adapt that solution to another context. 4. A pattern is manifest in a solution Where a pattern has been used, either consciously or unconsciously, to solve a particular problem, that pattern will be present and recognizable in the developed solution. Although a pattern does capture an abstract idea, it is not just that. It is also the recognition of the generality of that abstract idea, but explained, understood, and demonstrated in a concrete artifact. A pattern is thus not simply a tool used in a program’s design and then forgotten. It leaves an indelible mark on the finished product because it focuses on both design process and design structure: it is “both a process and a thing.”5 For example, a pattern’s mark is evident in the Prism system for planning radiation treatment programs for cancer patients.2 Prism’s design addressed the problem of developing a highly integrated but easily extendable computer system. Previous systems that needed to be extendable had often been designed to be loosely coupled, because tightly coupled systems were seen as too complex to extend and their behavior too complex to verify. Yet, in some environments—particularly those using integrated systems—a loosely coupled system does not model the tight real-world coupling between the system’s different parts and is therefore difficult and time-consuming to use. Use of the Mediator pattern makes a system such as Prism possible. The pattern structures integrated systems as “collections of visible, independent components integrated in networks of explicitly represented behavioral relationships.”2 In Prism’s case, a set of tumors, a set of corresponding buttons, and a set of panel displays are all independent object sets kept consistent by mediators. The pattern’s manifestation—the mediator objects—is clearly visible. In fact, it is critical to the system’s overall structure, as Figure 1 shows.
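A minimal Java sketch of such a mediator arrangement follows; the class and method names are illustrative and are not taken from the Prism implementation. The point is structural: the colleagues know only their mediator, and the mediator object that ties them together is plainly visible in the code, just as it is in Figure 1.

interface Mediator {
    void colleagueChanged(Object source, int index);
}

// Colleagues report changes to their mediator; they never reference each other.
class ButtonSet {
    private Mediator mediator;
    void setMediator(Mediator m) { mediator = m; }
    void buttonPressed(int index) {
        if (mediator != null) mediator.colleagueChanged(this, index);
    }
}

class PanelSet {
    void showPanel(int index) {
        System.out.println("Displaying panel " + index);
    }
}

// The explicitly represented behavioral relationship between the two sets.
class ButtonPanelMediator implements Mediator {
    private final ButtonSet buttons;
    private final PanelSet panels;

    ButtonPanelMediator(ButtonSet buttons, PanelSet panels) {
        this.buttons = buttons;
        this.panels = panels;
        buttons.setMediator(this);
    }

    public void colleagueChanged(Object source, int index) {
        if (source == buttons) {
            panels.showPanel(index);  // keep the panel display consistent with the buttons
        }
    }
}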
All design approaches strive for the same end: the creation of well-designed software. But approaches that focus solely on design process or methodology do not necessarily leave any identifiable imprint. In contrast, approaches such as design patterns that focus on both design process and structure directly influence the product’s visible structure. 5. A pattern captures system hot spots Software systems must remain stable in a highly dynamic environment. They will frequently both change and be subject to external changes. Building a stable software system is not about foreseeing every possible modification. Rather, stability is about understanding a domain well enough to build a system that can evolve appropriately. As Terry Winograd and Fernando Flores said, The most successful designs are not those that try to fully model the domain in which they operate, but those that are “in alignment” with the fundamental structure of that domain, and that allow for modification and evolution to generate new structural coupling. As observers (and programmers), we want to understand to the best of our ability just what the relevant domain of action is. This understanding guides our design and selection of structural changes, but need not (and in fact cannot) be embodied in the form of the mechanism.13
Central to any pattern is an invariant that solves a recurring problem. But any implemented solution varies or evolves with time. Patterns facilitate good design by capturing what Wolfgang Pree calls system “hot spots”14—those parts of a solution likely to change as a developed system evolves. In effect, the pattern captures the invariant and hot spots and provides a structure to manage the interaction between these stable and changing system elements. That structure is critical, because for the invariant part of a system to continue to be invariant in a dynamic environment, the interaction between the invariant and the rest of the system must be carefully defined. In the software domain, patterns isolate expected invariant system elements from the effects of changes to system hot spots. For example, the Composite pattern deals with situations where treating objects and compositions of objects uniformly is desirable.
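For concreteness, here is a minimal Composite sketch in Java with invented names; the shared Component class fixes the object structure (the invariant), while the operation applied to that structure, a simple cost calculation here, is the part expected to change.

import java.util.ArrayList;
import java.util.List;

abstract class Component {
    abstract double cost();  // the operation expected to vary over time
}

class Leaf extends Component {
    private final double price;
    Leaf(double price) { this.price = price; }
    double cost() { return price; }
}

class Composite extends Component {
    private final List<Component> children = new ArrayList<>();
    void add(Component c) { children.add(c); }
    double cost() {
        double total = 0;
        for (Component c : children) {
            total += c.cost();  // leaves and nested composites are treated uniformly
        }
        return total;
    }
}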
The invariant is the way objects are structured; the variant is the operation to be performed on the object. Interaction between system elements can be managed by having a Component class, which can have both Leaf and Composite subclasses. Differences between and changes to the Leaf and Composite classes are hidden from their users through the Component class interface.7 6. A pattern is part of a language Every pattern is connected to and shaped by other patterns. Patterns, therefore, are part of a network of interrelated patterns: a pattern language. Alexander explains: No pattern is an isolated entity. Each pattern can exist in the world, only to the extent that it is supported by other patterns: the larger patterns in which it is embedded, the patterns of the same size that surround it, and the smaller patterns which are embedded in it.
This is a fundamental view of the world. It says that when you build a thing you cannot merely build that thing in isolation, but must also repair the world around it, and within it, so that the larger world at that one place becomes more coherent, and more whole.15
Alexander’s architecture patterns are ordered from the largest patterns, for regions and towns, through neighborhoods, clusters of buildings, and so on, down to construction details. When the patterns are combined, they form a language for describing design solutions. An Accessible Green, for example, can be embedded in an Identifiable Neighborhood. It should also help to form Quiet Backs and must contain Tree Places. Doug Lea explains the relationship between patterns at different levels of such a pattern language: Patterns are hierarchically related. Coarse grained patterns are layered on top of, relate, and constrain fine grained ones. These relations include, but are not restricted to various whole-part relations. … Pattern entries are arranged conceptually as a language that expresses this layering.16
The software patterns in Design Patterns7 could also be part of a pattern language. For example, an Abstract Factory could be implemented using Factory Method, which in turn could use Template Method to avoid subclassing.
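One possible reading of that layering, sketched in Java with invented names: the abstract factory's products are obtained through factory methods, and a template method fixes the creation sequence so that concrete factories override only the individual steps.

interface Button { void render(); }
interface Panel  { void render(); }

abstract class DialogFactory {                 // Abstract Factory
    public final void createDialog() {         // Template Method: fixed skeleton
        Button button = createButton();        // Factory Method
        Panel panel = createPanel();           // Factory Method
        button.render();
        panel.render();
    }
    protected abstract Button createButton();
    protected abstract Panel createPanel();
}

class MotifDialogFactory extends DialogFactory {
    protected Button createButton() { return () -> System.out.println("Motif button"); }
    protected Panel createPanel()   { return () -> System.out.println("Motif panel"); }
}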
Pattern languages are critical because they capture, in some sense, the emergent behavior of complex systems: “The combination of patterns acting on a smaller level of scale acquires new and unexpected properties not present in the constituent patterns.”17 A pattern language is thus a collective of solutions to recurring problems, each in a context and governed by forces. The solutions work together at every level of scale to resolve a complex problem. A good pattern language guides designers toward good system architectures: ones that are useful, durable, functional, and aesthetically pleasing.18 In other words, the whole is more than the sum of the parts: a pattern language is more than the sum of its patterns. 7. A pattern is validated by use Patterns are usually discovered through concrete experience rather than abstract argument, although both are possible.5 But a pattern cannot be verified or validated from a purely theoretical framework. In the end, such proof of a pattern’s existence lies in its recurring, identifiable presence in artifacts. In a spoken language, new words are devised, or old words acquire new meanings, through common use. New words can also be created “theoretically”—for example, by combining appropriate word roots—but any newly minted word is not validated unless and until it achieves widespread use. So, too, a pattern’s repeated presence in existing artifacts confirms its usefulness. Theory is important, but to be meaningful it must be evaluated in the context of concrete experience. Consider the technique for solving difficult problems that Floyd outlined in his 1978 Turing Lecture: In my own experience of designing difficult algorithms, I find a certain technique most helpful in expanding my own capabilities. After solving a challenging problem, I solve it again from scratch, retracing only the insight of the earlier solution. I repeat this until the solution is as clear and direct as I can hope for. Then I look for a general rule for attacking similar problems, that would have led me to approach the given problem in the most efficient way the first time. Often, such a rule is of permanent value.8
Floyd’s technique offers insight into what it means for a pattern to be validated by use.
At the point where he has first solved the challenging problem, Floyd might have discovered a pattern. But not until he has discovered a “general rule for attacking similar problems”8 and used that rule in other situations can he call his solution a pattern. Alexander argues that in confirming the existence of architectural patterns “we must rely on feelings more than intellect.”5 He is precise in what he means by this—there is no simple rule with which to verify a pattern’s existence. Confirming the existence of architectural patterns is, therefore, not simply a process of abstract argument; it requires a more intangible mix of theory and practice. Schmidt notes that “Patterns are validated by experience rather than by testing, in the traditional sense of ‘unit testing’ or ‘integration testing’ of software.”11 However, he also states that validating a pattern’s existence by experience is difficult, because it is hard to know when a pattern is complete and correct. His group used periodical reviews of patterns to help with that process. Each pattern in the Design Patterns catalog lists its known uses: examples of the pattern in real systems. This provides a critical check on the pattern’s validity. In contrast, Ralph Johnson uses what he calls patterns to document object-oriented framework use, but gives no evidence that his patterns have occurred in more than one solution— that is, are used by a number of users of the HotDraw framework.4 To validate his patterns, he should give concrete examples of where they have been used. Then, he will have shown that they are useful. 8. A pattern is grounded in a domain A pattern is not an isolated entity. It is defined both in the context of other patterns (a pattern language) and with respect to a particular area or field to which it applies. Discussion of a pattern only makes sense as part of a pattern language. Moreover, discussion of a pattern has no meaning outside the domain to which it applies. For example, Design Patterns describes patterns in the domain of object-oriented software construction, whereas Johnson describes a set of patterns in the domain of framework use. Johnson’s HotDraw pat-
terns work well together, as do those from the Design Patterns catalog, but combining one from each makes no sense. A discussion of patterns must clarify what domain the patterns serve, and it must ensure that all patterns share a common domain. Otherwise, the discussion will likely be confused and confusing. 9. A pattern captures a big idea Design patterns are not about solutions to trivial problems, so not every solution to a software design problem warrants a pattern. Rather, patterns focus on key, difficult problems in a particular area—problems that designers in that area face time and time again, in one form or another. Thus, a pattern language “captures” a domain: together, the patterns in the language identify the domain’s key concepts and the important aspects of their interplay. The elements of other languages exhibit the same effect. For example, the key words of a spoken language—the nouns, verbs, and adjectives that carry much of the meaning in communication—correspond to key objects, actions, and descriptions that occur repeatedly. It is not necessary or even sensible to make up a new word for every concept; instead, we combine existing words in phrases and clauses that convey meaning. Consider the problem, in an OO context, of extending an object’s behavior. Often, the solution is simple—create a subclass with the extra behavior—and does not require a pattern. But this solution does not let existing objects take advantage of the extra behavior, because existing objects will inherit from the (nonextended) base class. If, instead, we add the extension to the base class itself, existing objects can use the new behavior, but the result can be an unwieldy base class that tries to be all things to all clients. The more complex problem of extending an object’s behavior without modifying the base class, such that existing objects that choose to do so can access the extended behavior, does warrant a pattern-based solution. It is a key problem in OO design and is addressed by Erich Gamma’s Extension Object pattern.19 This pattern lets an extension object’s clients choose and access the interfaces they need by defining extension objects and their interfaces in separate classes. A pattern must strike a balance. It must
propose a specific solution to a specific problem. But if the problem it addresses is not significant, the pattern approach’s impact is lost.
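To ground the Extension Object example, here is a rough Java sketch; the names and the lookup-by-class mechanism are illustrative rather than Gamma's published interface. A client asks the subject for an extension and uses it only if one has been registered, so behavior can be added without modifying the subject's base class or disturbing existing clients.

import java.util.HashMap;
import java.util.Map;

interface Extension { }

interface Spellable extends Extension {
    void checkSpelling();
}

class Subject {
    private final Map<Class<?>, Extension> extensions = new HashMap<>();

    <T extends Extension> void addExtension(Class<T> type, T extension) {
        extensions.put(type, extension);
    }

    <T extends Extension> T getExtension(Class<T> type) {
        return type.cast(extensions.get(type));  // null if no such extension is registered
    }
}

public class ExtensionDemo {
    public static void main(String[] args) {
        Subject document = new Subject();
        document.addExtension(Spellable.class, () -> System.out.println("spell check"));

        Spellable spell = document.getExtension(Spellable.class);
        if (spell != null) {
            spell.checkSpelling();  // clients unaware of Spellable are unaffected
        }
    }
}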
Developing a definition that completely captures the pattern concept is neither worthwhile nor possible. But it is possible and worthwhile to document essential characteristics of patterns as a means of clarifying and developing our understanding of the concept and thus our ability to identify and use patterns. How do our characteristics help answer the questions posed in the introduction? In our view, Mediator exhibits all the characteristics—a pattern language for OO software design will likely include this pattern. So, too, we would likely find Bubblesort in a language of algorithm-like patterns and Pipe and Filter in a pattern language for software architecture. Extend by Subclassing, however, falls short because it does not capture a big idea—including it in a language for software design would dilute the impact of more important patterns. Johnson’s HotDraw patterns fall short on two grounds. As we mentioned before, Johnson provides no evidence that they have been validated by repeated use, and they do not clearly identify the domain (framework use or design) in which they apply.
References 1. E. Gamma et al., “Design Patterns: Abstraction and Reuse of Object-Oriented Design,” Proc. 1993 European Conf. Object-Oriented Programming (ECOOP 93), Lecture Notes in Computer Science, vol. 707, Springer-Verlag, Heidelberg, Germany, 1993, pp. 406–431. 2. K.J. Sullivan, I.J. Kalet, and D. Notkin, “Evaluating the Mediator Method: Prism as a Case Study,” IEEE Trans. Software Eng., vol. 22, no. 8, Aug. 1996, pp. 563–579. 3. M. Shaw and D. Garlan, Software Architecture: Perspectives on an Emerging Discipline, Prentice Hall, Upper Saddle River, N.J., 1996. 4. R. Johnson, “Documenting Frameworks Using Patterns,” Proc. 1992 Conf. Object-Oriented Programming: Systems, Languages, and Applications (OOPSLA 92), ACM Sigplan Notices, vol. 27, no. 10, Oct. 1992, pp. 63–76.
5. C. Alexander, The Timeless Way of Building, Oxford Univ. Press, New York, 1979. 6. J.O. Coplien, Software Patterns, SIGS Books & Multimedia, New York, 1996. 7. E. Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, Reading, Mass., 1995. 8. R.W. Floyd, “The Paradigms of Programming,” Comm. ACM, vol. 22, no. 8, Aug. 1979, pp. 455–460. 9. S. Henninger, “Using Iterative Refinement to Find Reusable Software,” IEEE Software, vol. 11, no. 5, Sept. 1994, pp. 48–59. 10. C. Rich and R.C. Waters, The Programmer’s Apprentice, Addison-Wesley, Reading, Mass., 1990. 11. D.C. Schmidt, “Experience Using Design Patterns to Develop Reusable Object-Oriented Communication Software,” Comm. ACM, vol. 38, no. 10, Oct. 1995, pp. 65–74. 12. C. Alexander, Notes on the Synthesis of Form, Harvard Univ. Press, Cambridge, Mass., 1964. 13. T. Winograd and F. Flores, Understanding Computers and Cognition, Ablex Publishing, Norwood, N.J., 1986. 14. W. Pree, Design Patterns for Object-Oriented Software Development, Addison-Wesley, Reading, Mass., 1995. 15. C. Alexander, S. Ishikawa, and M. Silverstein, A Pattern Language, Oxford Univ. Press, New York, 1977. 16. D. Lea and C. Alexander, “An Introduction for ObjectOriented Designers,” Software Eng. Notes, vol. 19, no. 1, Jan. 1994, pp. 39–52. 17. N.A. Salingaros, “Structure of Pattern Languages,” Architectural Research Quarterly, vol. 4, no. 2, 14 Sept. 2000, pp. 149–162. 18. B. Appleton, “Patterns and Software: Essential Concepts and Terminology,” 2000, www.enteract.com/~bradapp/ docs/patterns-intro.html (current Nov. 2001). 19. E. Gamma, “Extension Object,” Pattern Languages of Program Design 3, R. Martin, D. Riehle, and F. Buschmann, eds., Addison-Wesley, Reading, Mass., 1998, pp. 79–88.
About the Authors Tiffany Winn is a PhD student in computer science at Flinders University, where she received her BSc (Hons.) in computer science. Her research interests are in design patterns and programming paradigms. Contact her at the School of Informatics and Eng., Flinders Univ., GPO Box 2100, Adelaide SA 5001, Australia; [email protected]; www. cs.flinders.edu.au/People/Tiffany_Winn.
Paul Calder is a senior lecturer in the School of Informatics & Engineering at Flinders University. His research interests include object-oriented software design, component-based software reuse, graphical interfaces, and data visualization. He is a member of the IEEE Computer Society, ACM SIGCHI, and ACM SIGPLAN. He earned his PhD in electrical engineering from Stanford University, where he was one of the developers of the InterViews user interface toolkit. Contact him at the School of Informatics and Eng., Flinders Univ., GPO Box 2100, Adelaide SA 5001, Australia; [email protected]; www.cs.flinders.edu.au/People/Paul_Calder.
feature
architecture reviews
Making Architecture Reviews Work in the Real World Rick Kazman and Len Bass, Software Engineering Institute, Carnegie Mellon University
A software architecture is more than just a technical blueprint of a complex software-intensive system. In addition to its technical functions, a software architecture has important social, organizational, managerial, and business implications.1 The same observation holds for architecture reviews. We can’t simply regard them as technical reviews and ignore their other implications.
Architecture reviews differ from other technical reviews because of their close relationship to a system’s business goals. Consequently, they should be approached differently, with an eye for nontechnical issues. Here, the authors explore the social, psychological, and managerial issues of formal architecture reviews.
Over the past five years, we have participated in over 25 evaluations of software and system architectures in many different application domains, using first the Software Architecture Analysis Method (SAAM)2 and later the Architecture Trade-off Analysis Method (ATAM).3 In discussing these methods in the past, we reported almost solely on their technical aspects. However, developing and refining these methods exposed us to a wide variety of systems, organizations, organizational goals, management styles, and individual personalities. These experiences have forged our opinions—in particular, we are now convinced of the need to explicitly teach and manage the nontechnical aspects of running an architecture review. We thus decided to record our observations and findings on the management, psychology, and sociology of performing architecture evaluations. Getting these factors wrong can doom the best technical effort.
This observation is not particularly an artifact of software development; it holds true of any complex engineering effort: When Brunel and Robert Stephenson were building railways in the 1830s and 1840s, they were expected to involve themselves with raising capital, appearing before Parliamentary Committees, conciliating influential people who might oppose the necessary bill in Parliament, negotiating with landlords over whose land the tracks were to be laid, managing huge gangs of labourers, and dealing with subcontractors. Railways engineers had to be expert in finance, in politics, in real estate, in labour management, and in procurement. Why should we be surprised if software engineers may need to draw on expertise in mathematics, financial analysis, business, production quality control, sociology, and law, as well as in each application area they deal with?4
The point is that these kinds of issues are relevant for anyone who has a business inter-
est in a system’s success—primarily the architects but also the managers, customers, and architecture reviewers. In particular, as architecture reviewers, we continually run into social, psychological, and managerial issues and must be prepared to deal with them. Architecture reviews Since at least 1976,5 reviews have been recognized as an efficient means for detecting defects in code or other artifacts. Why then, is an architecture review worthy of special consideration? In one of our engagements, during a discussion of business goals, one stakeholder turned to another and said, “I told you when you originally brought this up, and I will tell you again—you are a small portion of my market, and I cannot adjust [the topic under discussion] to satisfy you.” The difference between an architecture review and a code review is revealed in the question the review is designed to answer. The code review is designed to answer, “Does this code successfully implement its specification?” The architecture review is designed to answer, “Will the computer system to be built from this architecture satisfy its business goals?” Asking whether code meets specifications assumes the specifications are clear. Often a precondition for a code inspection is a prior inspection of the specifications. Asking whether an architecture satisfies business goals cannot simply assume clarity of the business goals. A system’s business goals will vary depending on the stakeholders’ perspectives and how much time has passed since the system was conceived. Time to market, cost, quality, and function priorities can all change based on events that occur during the initial architecture design. If the business goals aren’t clear, then one (important) portion of the architecture review process is to operationalize the business goals. Who participates in the review? In a code review, only the reviewers (usually three to five) are present, and they focus on the task at hand. In an architecture review, not only are three to five reviewers present but also the architect and, possibly, the architecture team. In addition, because business goals are discussed, a variety of stakeholders must also participate. At one
review, we had 30 to 40 stakeholders in the room. We were reviewing a large system for which the government had contracted, and many government agencies and contractors were developing parts of the hardware and software. As we will later discuss, setting the business goals and having different stakeholders involved causes many logistical problems. Furthermore, in a code review, most of the participants are development professionals who are familiar with the review process. Those who are not can be instructed to study a description of the code inspection process prior to the review. For many stakeholders, the architecture review is their first experience with a review that attempts to match business goals with an architecture, and many are busy managers and professionals who can’t be expected to study a process description. Consequently, we must spend valuable review time explaining the process. Also, having stakeholders who cross organizational boundaries inherently raises the review’s visibility. Managers from multiple groups interested in the system’s success (or failure) are aware of the review and interested in its outcome. What does “success” mean? A client (whose system was behind schedule and over budget) once complained that his architecture had already been reviewed multiple times. “I’m tired of being reviewed by amateurs,” he said. “They simply repeat what they were told and have no value added.” Code reviews have as their output a collection of discovered defects. There is ample evidence that discovering defects during a code inspection is more cost-effective than discovering the defects after deployment. Furthermore, there is usually no controversy about whether a defect has been discovered. Someone proposes a sequence of events under which incorrect results will occur to convince the inspection team of the defect. However, the outputs from an architecture review are much more varied—some architectural documentation, a set of scenarios of concern, and a list of risks, sensitivity points, and tradeoff points. If the stakeholders do not agree on the form or value of the review’s outputs, then success will be difficult to achieve (or even define).
Reviewing the risks ATAM-based architecture inspections have as their major output a collection of risks, sensitivity points, and tradeoff points. A risk is an alternative that might cause problems with respect to meeting a business goal. Notice the vagueness in this definition: an alternative that “might” cause problems. Because software systems are expensive to construct, it is nearly impossible to gather evidence about development paths not taken. There are four broad categories of outputs that can emerge from an architecture review:

■ Technical risks, which emerge from a SAAM or an ATAM. For example, “Given the current characterization of peak load and the existence of only two Web servers, we might not meet the architecture’s latency goals.” We can mitigate such risks through more analysis, simulation, or prototyping.
■ Information risks, involving areas of the architecture lacking information. There are times during a review when the response to a reviewer’s questions is simply, “We haven’t thought about that.”
■ Economic risks (cost, benefit, and schedule), which are not directly technical in nature but are about dollars and deliverables. “Can we deliver this functionality to our customers by August? Will we lose market share if our performance isn’t quite as good as our competitor’s? Can we build this for less than $80 per unit?” Architectural decisions all profoundly affect these questions. We can mitigate these risks using architecture analysis techniques that focus on these issues (for example, the Cost–Benefit Analysis Method6).
■ Managerial risks, which involve having an architecture improperly aligned with the organization’s business goals.7 Such misalignment can be risky for an organization—regardless of its product’s technical superiority—and might require a costly realignment process. Examples of other kinds of managerial risks include misaligning the architecture’s structure and the development organization’s structure or depending on suppliers whose reliability is unknown or suspect. Each case requires a kind of realignment
between the architecture and management or business strategy. Planned versus unplanned Code reviews are usually included in the normal development plan as are architecture reviews. However, unplanned architecture reviews also sometimes occur, either because a stakeholder (typically an architect or manager) becomes interested in the possibility of doing such a review to validate and improve an existing project or because an upper-level manager wants a review to scrutinize a project that is perceived as being in trouble. Unplanned reviews for troubled projects have a certain amount of inherent tension. Everyone involved understands that the stakes are high and the project could be cancelled as a result of the review. Thus, such reviews often result in finger pointing and blaming but unfortunately offer little hope of rescuing the architecture or project. Unplanned reviews also require a certain amount of selling. “What are the reviewers reviewing? What will be the outcomes? How much time will it cost us [the client] in dollars and days?” These are the kinds of questions that the stakeholders ask. They must be convinced that an architecture review will help and not hinder the project or their decision making prior to proceeding.
Review preparation A code inspection meeting typically inspects about 300 lines of code. Although not all of a system’s code is necessarily inspected, the decision of which code to inspect is made outside of the inspection. Furthermore, a system’s code should be available prior to the review—otherwise, there’s nothing to review. In an architecture review, because its goal is to decide how well the architecture supports the business goals, reviewers can inspect any portion of the system. Furthermore, because the business goals are often ambiguous, we need to dedicate a portion of the review to determining which parts of the system are involved in meeting those business goals. This means that the reviewers must be aware of the entire system’s architecture. Also, the architecture is not always adequately documented—in contrast to code inspections, adequate architectural documentation might not have been prepared. For example, one of our reviews was January/February 2002
unplanned in nature, and although we were committed to a particular schedule, the architecture documentation was not forthcoming. Finally, two days before the review took place, we received a vast collection of class descriptions said to constitute the “architecture documentation.” That was the basis under which we were forced to conduct the review. As might be expected, the review did not follow the normal procedure and was somewhat ad hoc in nature. Managing the review process Let’s apply these observations about architecture reviews to motivate a set of concerns that a reviewer will face when working through the review process. Different architecture-review techniques have different processes. For this reason, we restrict ourselves to discussing the three broad stages of activity in any review: prework, which involves negotiating and preparing for the review; work, which involves scrutinizing the architectural artifacts; and postwork, which is when you report the review’s results and determine what actions to take. Prework It is easy to pay less attention to the activities that precede a review, because they don’t involve any technical analyses. However, because of an architecture review’s unique nature, if these activities are ignored, the review’s results might be meaningless. If the customer is completely unprepared to be reviewed, it wastes the time of both the reviewers and the team under review. Prework usually involves three main tasks, and although it isn’t always exciting, it ensures that the later work is exciting and meaningful. First, prework involves selling the review. Because reviews tend to originate from different sources within an organization, it is often the case that some of the participants are unhappy with the review. This is particularly the case when the review activity is not part of the normal software process. In such a case, it is often viewed as an intrusion on the stakeholders’ time. So, it is crucial, both before and during the review, to sell management, the architects, and the other stakeholders on the review’s value—that a reviewed system will meet the stakeholder’s goals better than an unreviewed system. Second, it is also important to set expec-
tations. All of the stakeholders should understand the method’s capabilities. In a typical ATAM, you meet with a client for just three days. You need to use this limited time to evaluate the architecture’s ability to achieve its business goals—you will not do detailed technical analyses in any dimension. Finally, you need to decide who should be there for what stages. Some participants will perceive any review, no matter how crucial, as an intrusion on their precious time. Hence, it is important to identify and forewarn the stakeholders and ensure that each of them knows which steps of the process require their presence, and why. In addition, it is often useful to strictly limit attendance in many of the steps to the minimal set of stakeholders needed, because this typically makes for a more efficient and productive use of time. As mentioned earlier, there are times when 30 or 40 stakeholders will want to attend, perhaps to impress their bosses or sponsors, further their own agendas, or bill for extra time. Whatever the reason, their agendas should not dilute the review’s effectiveness. As part of the standard set of ATAM materials, we have prepared a presentation designed to support these three goals. The presentation explains the evaluation’s goals and process as well as its costs and benefits. This presentation is useful both for those who are going to participate in the evaluation and those whose concurrence is needed for the evaluation to proceed. Work During an architecture review, we not only perform the steps of the review process but also act as facilitators. This is where the reviewer’s personal qualities (mentioned earlier) are important. The review team must have members who are experienced, architecturally savvy, and quick on their feet. Technical rationale During an architecture review, we should be able to establish business goals, rely on the architect, and identify risks. An effective architecture review process presents the system’s business goals to the reviewers and stakeholders and identifies and prioritizes a set of scenarios that represent concrete specifications of these goals. For example, if one business goal is for the system to be long-lasting, then modification
scenarios become a high priority. These high-priority scenarios determine the evaluation’s focus. Because the architect identifies how the architecture will satisfy such scenarios, he or she is critical to making the architecture understandable. Each scenario begins with a stimulus—an event arriving at the system, a fault occurring, or a change request being given to the development team. The architect walks through the architecture, explaining what is affected by the stimulus—for example, how the system will process an event or detect a fault, or what components will be affected by a change. The architect will also review how a response to the stimulus will be made—the recovery mechanisms needed and which modules will change when a change request is addressed. Each scenario reflects specific business goals. To ensure that these business goals are adequately met, part of the evaluators’ due diligence is to understand the architect’s explanations of how the architecture will respond to the stimulus and how any identified risks will be dealt with. Risks are identified as a result of three possible conditions: the explanation of how the architecture responds to the stimulus is not convincing, business goals other than the ones reflected in the scenario are violated, or fundamental decisions have not yet been made. Usually, when an architecture review is held, not all architectural decisions have been made. However, this becomes a risk when too many decisions have yet to be made or when the indecision threatens business goals. Social behavior To best use the stakeholders’ time during the review, the team needs to

1. control the crowd,
2. involve the key stakeholders,
3. engage all participants,
4. maintain authority,
5. control the pace, and
6. get concurrence and feedback.
Crowd control is critical. At the start of the review meeting, establish how and when people may interact with each other. For example, it is important to avoid disruptions, so side conversations, cell phones, and pagers should be banned. Establish at the
outset whether people can come and go, and when they absolutely must be present. Latecomers should not expect the proceedings to stop so they can be updated. Holding the review meeting away from the home site of the team being reviewed is an effective way of minimizing interruptions. Involving key stakeholders is also important, because any activity that involves the entire group interacting also helps achieve buy-in. As each point important for the design or analysis emerges, record it visibly and verify that the recording is correct. Participants then feel they are helping in the process; it isn’t just the reviewers reviewing the architecture—it’s the entire group discovering aspects of the architecture. For example, in one review, the architecture team (and the lead architect, in particular) were skeptical of the review’s value and hence initially participated only grudgingly. Throughout the review, and as insights into the system emerged, we convinced the architect that this activity was for everyone’s benefit and that we were not there to point fingers but rather to improve the architecture and hence the resulting system. By the end of the review, the architect was an enthusiastic participant. In addition to involving the key stakeholders, it is important to engage all participants. It is not uncommon, in any kind of meeting, to have people who dominate the airwaves or people who are shy about participating. For example, some people might be reluctant to speak frankly in front of their bosses or their subordinates. For these people, it is important to provide a forum in which they are either alone or only among peers. It is also important to provide a mix of free-for-all participation and periods where each person has a dedicated “safe” time to speak. Of course, despite your best efforts, there might be times when an evaluation gets out of control: people will have side conversations, try to steal the agenda, or resist providing information. The review team needs to know who has ultimate authority if people are being disruptive. Is it the review team leader, the customer, a specific manager, or the architect? The review team facilitator should have a strong (but not dogmatic) personality and should be able to manage the group dynamics through a combination of humor, appeal, and authority.
Similarly, the team architecture review facilitator must be able to control the pace. Any meeting with a diverse group of (likely highly opinionated) stakeholders will range out of control from time to time, as mentioned. But sometimes these conversations are revealing—hidden agendas, worries, future requirements, past problems, and a myriad of other issues come to light. The facilitator must be aware of these digressions and know both when to squelch them and when to let a conversation continue. A portion of the time allocated is reserved for the reviewers to prepare their briefing. This same time can be used for offline meetings among the stakeholders. This kind of facilitation can be exhausting: assimilating huge amounts of information, looking for problems, and managing all of the political and personal issues simultaneously. Thus, we have found that it is useful to have two facilitators and to switch between them periodically to give each a mental break. Finally, be sure to obtain concurrence and feedback. The activities and information generated in a review are not really directed at the review team, even though the review team is frequently the focus of the conversation and the source of many of the probing questions. The review’s outputs are really for the stakeholders—the review team members are just there to act as catalysts, experts, and facilitators. Because of this mismatch between the producers and consumers of the information and the way that the information is elicited (through the facilitation of the review team), extra care must be paid to ensure that all stakeholders concur with whatever is recorded. In our reviews, for example, we typically record information on flip-charts in real time as well as on a computer—for later presentation or out-briefing and for the final report. When we put information up on a flip-chart, we have a golden opportunity to ensure that the information recorded is correct. Thus, we endeavor to get concurrence as items are posted and ensure that we keep flip-charts visible around the room so that they are always available for consultation, correction, and refinement. Architects are human and welcome encouragement and positive feedback. A review is mostly concerned with finding problematic decisions, so it is useful for the reviewers to occasionally
make positive comments about particular architectural decisions. This helps alleviate the generally negative questions and sets a more positive tone. Postwork Reviews of all kinds are part of a mature organization’s software process, but the time and trouble involved are wasted unless there is a predefined output and a determination of who will act on the results. Otherwise, the review’s outputs will end up buried on someone’s desk and become a low-priority item. Thus, postwork activities must report the outputs and follow up on the review’s results. When the review is complete, there must be some agreement on how to communicate the outputs back to the stakeholders— in particular, to management and the architecture team. In the ATAM, we give slide presentations to the stakeholders immediately after the review and then, some weeks later, we deliver a formal and more extensive written report. Regardless of the result’s form, the review team must know who gets told what and when. For example, can the architects respond to the review report before it goes to management or other stakeholders? A slide presentation or report from a review can have many possible destinies. In our experience, the review’s outputs (lists of scenarios, architectural styles, risks, nonrisks, sensitivities, and trade offs) validate existing project practices, change existing project practices (such as architecture design and documentation), argue for and obtain additional funding from management, and plan for future architectural evolution. In one case, a participant, the day after the review, used the outbrief to convince management that he needed more resources for the project—an unsuccessful appeal prior to the review. It is thus important to consider the report’s goal and to plan accordingly.
The time is ripe for the widespread adoption of architecture reviews as a standard part of the software engineering lifecycle. Materials to teach and support the practice of architecture reviews have been increasing in both quantity and
quality, including a book devoted entirely to architecture reviews.3 Many large corporations are now adopting architecture reviews as part of their standard software engineering development practice, and some are even including these reviews as part of their contracting language when dealing with subcontractors. In addition, the scope of these reviews is growing to include far more than just the technical issues. As we have stressed, when dealing with an architecture, business issues are the driving factors in design. By considering the relations between business goals and architecture, and considering them early in the software development (or redevelopment) process, they might be dealt with in a way that provides the greatest benefit for the system’s many stakeholders. References 1. L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, Addison-Wesley, Reading, Mass., 1998. 2. R. Kazman et al., “SAAM: A Method for Analyzing the Properties of Software Architectures,” Proc. 16th Int’l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 1994, pp. 81–90. 3. P. Clements, R. Kazman, and M. Klein, Evaluating Software Architectures: Methods and Case Studies, Addison-Wesley, Reading, Mass., 2001. 4. M. Jackson, Software Requirements and Specifications, Addison-Wesley, Reading, Mass., 1995.
5. M.E. Fagan, “Design and Code Inspections to Reduce Errors in Program Development,” IBM Systems J., vol. 15, no. 3, 1976, pp. 182–211. 6. R. Kazman, J. Asundi, and M. Klein, “Quantifying the Costs and Benefits of Architectural Decisions,” Proc. 23rd Int’l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 2001, pp. 297–306. 7. J. Henderson and N. Venkatraman, “Strategic Alignment: Leveraging Information Technology for Transforming Organizations,” IBM Systems J., vol. 32, no. 1, 1993, pp. 4–16.
About the Authors Rick Kazman is a senior researcher at the Software Engineering Institute of Carnegie Mellon University and an adjunct professor at the Universities of Waterloo and Toronto. His primary research interests are software engineering (software architecture, design tools, and software visualization), human–computer interaction (particularly interaction with 3D environments), and computational linguistics (information retrieval). He received a BA and M.Math from the University of Waterloo, an MA from York University, and a PhD from Carnegie Mellon University. His book Software Architecture in Practice (written with Len Bass and Paul Clements, Addison-Wesley, 1998) received Software Development Magazine’s Productivity Award. His most recent book is Evaluating Software Architectures: Methods and Case Studies (written with Paul Clements and Mark Klein, Addison-Wesley, 2001). Contact him at [email protected].
Len Bass is a senior researcher at the Software Engineering Institute of Carnegie Mellon
University. He has written or edited six books and numerous papers in a wide variety of areas of computer science, including software engineering, human–computer interaction, databases, operating systems, and theory of computation. He received his PhD in computer science from Purdue University. Contact him at [email protected].
feature
programming languages
UML-Based Performance Engineering Possibilities and Techniques Evgeni Dimitrov and Andreas Schmietendorf, Deutsche Telekom Reiner Dumke, University of Magdeburg
Design largely determines an information system’s performance efficiency. Performance engineering considers a system’s performance from the earliest phases of development. The authors review UML-based PE, analyze corresponding approaches, and examine the current tools available.
Examining performance at the end of software development is a common industrial practice, but it can lead to using more expensive and powerful hardware than originally proposed, time-consuming tuning measures, or (in extreme cases) completely redesigning the application. This type of subsequent improvement always involves the additional costs associated with improving performance and not having systems available on time.
To avoid such costs, a system’s performance characteristics must be considered throughout the whole software development process. Performance engineering (PE) provides a collection of methods to assure appropriate performance-related product quality throughout the entire development process.1 Distributed applications based on new technologies such as the Internet, Java, and Corba must explicitly take performance characteristics into account because these technologies directly affect a company’s value chain. The basic challenge with such systems is how to manage and minimize risk during the software’s entire life cycle rather than waiting for the system test phase. Integrating PE into system development and the procedural model right from the start is a step toward meeting this challenge. To implement PE methods and tools, you need a plan to integrate PE into the software
life cycle that includes

■ a clear description of the necessary PE processes;
■ an integrated approach for realizing PE based on the Unified Modeling Language,2 because it is now an established software standard;
■ a proposal for an organizational structure within the company that will be responsible for implementing PE;
■ derivation of the necessary expenditures required to carry out the PE tasks; and
■ initiation of a company-wide flow of information to exchange performance-related data.
This article analyzes the UML-based approach to performance modeling. We will show how to work out suitable requirements, analyze existing approaches, and
summarize the possibilities for tool support. The Object Management Group is also considering these requirements and is working on extending UML to consider performance aspects. For this, the OMG has introduced the corresponding Request for Proposal, “UML Profile for Scheduling, Performance, and Time.”3
Requirements for a UML-based approach
Object-oriented modeling offers the possibility for shared data and function modeling. Using UML notation (which is largely standardized) provides a suitable initial basis for building performance models. The aim of performance modeling is to verify the effects of architecture and design solutions in terms of their performance behavior; the performance behavior at the user interface is of interest in terms of response time and throughput. Although we can implement the first requirement without directly considering hardware and software, a performance analysis of the entire system means examining both.
When studying the methods available for software engineering, we must distinguish between those used for modeling real-time systems and conventional systems. For real-time systems, developers have explicitly considered performance behavior by using expanded Petri networks, the Specification and Description Language (SDL), or theory models. This is not possible for conventional systems without taking further action. The difficulty lies in the complexity and abstraction of the underlying layers and the interactions between software systems, which are sometimes impossible to follow. Furthermore, we generally model such systems using informal languages such as UML, which makes the transfer to performance models harder.
We analyze three UML-based approaches to PE in this article:

1. Directly representing performance aspects with UML and transferring effective model diagrams into corresponding performance models.
2. Expanding UML to deal with performance aspects.
3. Combining UML with formal description techniques (FDTs) such as SDL and Message Sequence Charts (MSCs).

Although Approach 2 seems to be best for the developer, we must assume (based on current knowledge) that only Approaches 1 and 3 provide techniques that we can use in engineering terms. Several requirements for a UML-based approach provide a framework for performance modeling:

■ a UML modeling style that uses UML diagrams for performance modeling,
■ automated (where possible) generation of performance models to analyze the model's performance characteristics (a small sketch of this idea follows the list),
■ clear delimitation between different architectural aspects, such as separate representations (diagrams) for application, infrastructure, and distribution aspects, and
■ support for iterative and incremental software development.
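To make the second requirement concrete, here is a small, tool-neutral sketch in Python: given a scenario whose steps are annotated with the resource they use and their service time (the kind of information tagged values on sequence diagrams can carry), the demand each resource sees per user interaction can be aggregated automatically. All step names, resources, and times are illustrative assumptions, not data from the article.

# Aggregate per-resource service demand from an annotated scenario.
# Each step is (step name, resource, service time in seconds); the values
# are hypothetical and stand in for tagged values on a sequence diagram.
from collections import defaultdict

scenario = [
    ("check login",         "AppServer", 0.150),
    ("prepare search",      "AppServer", 0.100),
    ("search by name",      "DBServer",  0.150),
    ("prepare modify data", "AppServer", 0.120),
    ("store modify data",   "DBServer",  0.150),
]

def demand_per_resource(steps):
    """Sum the service time each resource contributes to one interaction."""
    demand = defaultdict(float)
    for _, resource, service_time in steps:
        demand[resource] += service_time
    return dict(demand)

if __name__ == "__main__":
    for resource, seconds in demand_per_resource(scenario).items():
        print(f"{resource}: {seconds:.3f} s per interaction")

Aggregated demands of this kind are exactly the input a queue-based or simulation-based performance model needs, which is why automated extraction from the diagrams is worth requiring.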
Approach 1: Direct representation of performance aspects using UML
With this approach, the aim is to use UML concepts and elements—without adding nonconforming UML elements—to represent and interpret performance aspects. The only permitted extensions are Object Constraint Language (OCL) conformity constraints, stereotypes, and tagged values to particular UML elements so that we can derive and record performance information.4 This section describes using major UML concepts within the framework of a PE project. We deliberately rejected using model elements in UML that provide no additional value for PE tasks.

Use case diagrams
Use cases describe user behavior (either human or machine). In particular, a use case diagram models the system's external stimuli (see Figure 1). Actors serve here as a basis for defining the system's workload. It is not particularly useful to represent each system user with an actor. Instead, the actors should be assigned several roles (or single roles) to represent a specific part of the system's workload. We can use a process based on ISO 14756 for a formalized description of the workload within the framework of use case diagrams. This requires many steps (a small example follows the list):
Figure 1. A load- and time-weighted use case diagram.
■ Identifying activity types (ATs) and recording the number and type of users.
■ Deriving task types (TTs) by assigning ATs to task mode and defining service-level requirements.
■ Defining chain types as fixed sequences of TTs.
■ Defining the percentage occurrence frequency q and the user preparation time (UPT) for concrete chain types while the program runs.
■ Laying down the reference value of mean execution time tRef (response times) for concrete task types, in the form "tRef for TTj should take place at x% in ≤ n sec and y% in ≤ n + 5 sec, with x% + y% = 100%," and BRef as the throughput for TTj derived from tRef.
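A hedged illustration of how this load model can be used: once the overall interaction volume and the occurrence frequencies q are fixed, the throughput targets BRef per task type follow directly and can be recorded next to the tRef targets. The concrete numbers below are assumptions made for the sketch only.

# Derive per-task-type throughput targets (BRef) from an assumed total load
# and the occurrence frequencies q of the task types; tRef and UPT are the
# response-time target and user preparation time from the load model above.
# All concrete numbers are illustrative assumptions.

task_types = {
    "TT1 (search)":         {"q": 0.40, "upt_s": 3.0, "tref_s": 0.5},
    "TT2 (insert)":         {"q": 0.30, "upt_s": 1.0, "tref_s": 1.0},
    "TT3 (search by name)": {"q": 0.10, "upt_s": 3.0, "tref_s": 5.0},
    "TT4 (modify)":         {"q": 0.20, "upt_s": 5.0, "tref_s": 0.5},
}

TOTAL_TASKS_PER_HOUR = 5000.0  # assumed overall workload

def bref(total_per_hour, q):
    """Throughput target for one task type: its share of the total load."""
    return total_per_hour * q

if __name__ == "__main__":
    assert abs(sum(t["q"] for t in task_types.values()) - 1.0) < 1e-9
    for name, t in task_types.items():
        print(f"{name}: BRef = {bref(TOTAL_TASKS_PER_HOUR, t['q']):.0f} tasks/h, "
              f"tRef = {t['tref_s']} s, UPT = {t['upt_s']} s")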
Use cases described in this way are not directly used to derive performance information. That's possible only if they are supported through scenarios and related interaction diagrams. However, the load model described provides the ideal conditions for producing performance models. We can also use it as an input variable for corresponding benchmark tests that use load driver systems. Furthermore, individual use cases can become weighted through timing constraints.

Interaction diagrams
Sequence diagrams as special interaction diagrams offer sufficient potential to obtain and present performance information. They represent time relations by introducing an explicit time axis (time progresses from top to bottom). The vertical layout of messages in the diagram helps define the messages' chronological sequence. Normally, the time axis is scaled on an ordinal basis—for example, the size of the vertical distances is only significant for positioning. A rationally scaled time axis is also permitted for modeling performance aspects and real-time processing. We can add additional time information by labeling the messages with relative constraints. We can assign time attributes to the messages (horizontal lines) and to the method execution (vertical bars—see Figure 2). In the former, the time given is interpreted as latency; in the latter, it specifies the time for method execution. In addition to individual time attributes, setting comprehensive constraints on a group of nested methods is also useful.

Figure 2. Representation of time information in a sequence diagram. The state marker offers the possibility to set up a reference between sequence and state diagrams. Consequently, it's possible to follow a state transition within the sequence diagram.

Implementation diagrams
Implementation diagrams show the software components implemented, their relationships with each other, and their assignment to individual nodes in a hardware topology. We can use them to define hardware resources and their performance behavior or to quantify the resource requirement of concrete software components (for example, the application server in Figure 3 needs a Windows NT system with a CPU performance of 100 SPECint95). We can also easily use them in a queue model. With this option, components are shown using services and relationships between components (links) that in turn use queues with a finite operating throughput. An open network models the actors that represent the workload.5
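The queue-model option just mentioned can be pictured with a minimal sketch: each deployment node is treated as a single M/M/1 station in an open network, and utilization and residence time per station follow from the arrival rate and the per-interaction service demand. The arrival rate and service demands below are invented for the illustration, not measurements.

# Open queueing network sketch: each deployment node is modeled as an
# M/M/1 station; the workload is an assumed arrival rate of interactions.
# Service demands (visits x service time per visit) are illustrative.

stations = {
    # name: total service demand per interaction, in seconds
    "AppServer": 0.37,   # e.g., several preparation steps
    "DBServer":  0.30,   # e.g., search and update accesses
}

ARRIVALS_PER_SECOND = 1.5  # assumed interaction arrival rate

def mm1_metrics(arrival_rate, demand_s):
    """Utilization and mean residence time of an M/M/1 station."""
    utilization = arrival_rate * demand_s
    if utilization >= 1.0:
        return utilization, float("inf")  # saturated station
    residence = demand_s / (1.0 - utilization)
    return utilization, residence

if __name__ == "__main__":
    total_response = 0.0
    for name, demand in stations.items():
        rho, r = mm1_metrics(ARRIVALS_PER_SECOND, demand)
        total_response += r
        print(f"{name}: utilization = {rho:.0%}, residence time = {r:.2f} s")
    print(f"End-to-end response time estimate: {total_response:.2f} s")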
Figure 3. Performance aspects within a deployment diagram.

State and activity diagrams
State diagrams show the relationship between a particular class's events and states and thus offer a direct link with Markov modeling. However, the main problem here is determining transfer rates for the system—an area that does not yet have a sufficiently strong scientific basis. Developers often use state diagrams to model the workload of e-commerce systems.6 Applying the so-called customer behavior model graph (CBMG) shows possible user demands on the system as a result of state transitions (see Figure 4). A probability is assigned to each transition, and for every user type, a corresponding CBMG must be constructed. In this case, we can also assign corresponding thinking times to the state transitions. With the CBMG's help, it's possible to derive extensive information about user behavior—for example, the average number of calls of a state within the CBMG or the average length of a user session.
Figure 4. The customer behavior model graph. The numbers assigned to each transition show the probability of a state change.
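The two CBMG metrics just mentioned can be computed directly from the transition probabilities, as in the following sketch. The states, probabilities, and think times are illustrative assumptions and do not reproduce Figure 4.

# Customer behavior model graph (CBMG) sketch: estimate the average number
# of visits to each state per session from the transition probabilities.
# Rows need not sum to 1; the missing mass is the probability of leaving (Exit).

states = ["A", "B", "C", "D"]
P = {  # P[s][t] = probability of moving from state s to state t (assumed)
    "A": {"B": 0.5, "C": 0.3},       # remaining 0.2 -> Exit
    "B": {"A": 0.4, "C": 0.5},       # remaining 0.1 -> Exit
    "C": {"D": 0.5},                 # remaining 0.5 -> Exit
    "D": {"C": 0.35, "A": 0.5},      # remaining 0.15 -> Exit
}
entry = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}          # sessions start in A
think_time_s = {"A": 4.0, "B": 6.0, "C": 3.0, "D": 8.0}   # assumed think times

def average_visits(states, P, entry, iterations=200):
    """Solve V = entry + V * P by fixed-point iteration."""
    visits = dict(entry)
    for _ in range(iterations):
        new = dict(entry)
        for s in states:
            for t, prob in P[s].items():
                new[t] += visits[s] * prob
        visits = new
    return visits

if __name__ == "__main__":
    v = average_visits(states, P, entry)
    session_length = sum(v[s] * think_time_s[s] for s in states)
    for s in states:
        print(f"State {s}: {v[s]:.2f} visits per session on average")
    print(f"Average session length: {session_length:.1f} s of user think time")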
Activity diagrams are defined in the UML metamodel as a specialized form of state diagrams. They describe sequences or processes that are generally used relatively early in the development process. They are similar to Petri networks and can help express competing behavior with explicit synchronization mechanisms. Possible examples for using activity diagrams are a description of a use case that consists of several steps (Figure 5a) or a representation of the interaction of several use cases (Figure 5b). However, at the moment it is not yet clear how to use activity diagrams directly within the scope of PE and what use they will be (in concrete terms) for performance modeling.
Combining state and collaboration diagrams forms a practical basis for simulating UML models. Embedding an object's state diagram into the collaboration diagram in which the corresponding object is represented creates this combination. Although several prototypes have emerged, we still need suitable simulation tools to simulate such diagrams.5,7
Figure 5. Representation of time requirements within activity diagrams.
Approach 2: Expanding UML to deal with performance aspects
A common requirement of all real-time systems is correct time behavior. The planned software's architecture is extremely important here. You can design a good architecture by using the Real-time Object-Oriented Modeling (ROOM) method, which developers have used for 10 years.8 To convert ROOM constructs to UML, you can use the standard UML tailoring mechanisms, which are based on stereotypes, tagged values, and constraints.9 The resulting UML expansion is often described as real-time UML (or UML-RT).

Real-time UML
There are three main constructs for modeling such a structure:
■ Capsules, which are complex physical architectural objects that interact with their environment through one or more signal-based interface objects.
■ Ports, which represent a capsule's interface objects.
■ Connectors, which represent abstract views of communication channels.
An internal network of subcapsules working together represents complex capsules. These subcapsules are in turn capsules in themselves and can break down into subcapsules again. Capsules and ports are modeled in UML as the stereotype of a corresponding class with the stereotype name <<capsule>> or <<port>>. An association models a connector and declares it in a collaboration diagram of the capsules by the role name given for the association.
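The following minimal Python sketch only illustrates the capsule, port, and connector ideas (signal-based interaction through wired ports); it is not the ROOM or UML-RT API, and all class and signal names are hypothetical.

# Minimal illustration of capsules communicating through ports over a
# connector; a timer service could inject timeout signals that are handled
# exactly like other signal-based events. Names are hypothetical.
from collections import deque

class Port:
    def __init__(self, owner):
        self.owner = owner       # the capsule this port belongs to
        self.peer = None         # set when a connector wires two ports

    def send(self, signal):
        if self.peer is not None:
            self.peer.owner.inbox.append(signal)

def connect(port_a, port_b):
    """A connector: an abstract view of the channel between two ports."""
    port_a.peer, port_b.peer = port_b, port_a

class Capsule:
    def __init__(self, name):
        self.name = name
        self.port = Port(self)
        self.inbox = deque()

    def handle(self, signal):
        print(f"{self.name} received signal: {signal}")

    def step(self):
        while self.inbox:
            self.handle(self.inbox.popleft())

if __name__ == "__main__":
    sensor, controller = Capsule("Sensor"), Capsule("Controller")
    connect(sensor.port, controller.port)
    sensor.port.send("measurement")
    controller.inbox.append("timeout")   # what a timer service would inject
    controller.step()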
Figure 6. Forward engineering for combining UML, MSC, and SDL.14
Behavior is modeled on the basis of protocols for specifying the required behavior at a connector and special state machines that are assigned to capsules. A time service is provided to represent time aspects. A service of this type, which can be accessed through a standard port, transforms time into events, which can then be treated in exactly the same way as other signal-based events.

Expanded UML
UML is often expanded through the introduction of additional expressions and constructs to appropriately model real-time and embedded systems.10,11 Three new types of diagram are important here.

Timing diagram. We can use timing diagrams, which are widely used in electrotechnical systems, to explicitly model changes in state over time. The horizontal axis shows the time, and the vertical axis shows various system states. The time axis is normally linear—thus, these diagrams represent a suitable supplement to the UML state diagrams in terms of time. Timing diagrams basically exist in two forms:11
■ a simple timing diagram represents a single path within a state diagram, and
■ a task timing diagram represents the interaction of several tasks over time.
The most important elements of timing diagrams are period, deadline, time of introduction, execution time, dwell time, flat time, rise and fall time, and jitter.11
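A small sketch shows how a few of these elements can be checked against a trace of task activations and completions; the timestamps and the deadline are invented for the example.

# Check period, response time, and jitter of a periodic task against an
# assumed deadline, using (activation, completion) timestamps in ms.
activations = [0.0, 10.2, 19.8, 30.1, 40.0]     # illustrative trace
completions = [3.1, 13.0, 23.5, 33.0, 42.9]
DEADLINE_MS = 5.0

def periods(times):
    return [b - a for a, b in zip(times, times[1:])]

if __name__ == "__main__":
    response_times = [c - a for a, c in zip(activations, completions)]
    ps = periods(activations)
    mean_period = sum(ps) / len(ps)
    jitter = max(ps) - min(ps)           # simple peak-to-peak period jitter
    missed = [r for r in response_times if r > DEADLINE_MS]
    print(f"Mean period: {mean_period:.2f} ms, peak-to-peak jitter: {jitter:.2f} ms")
    print(f"Worst-case response time: {max(response_times):.2f} ms, "
          f"deadline misses: {len(missed)}")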
Concurrency diagram. The collaboration diagram considers interobject communication in a task-unrelated view. However, there are three forms of object-to-object communication in a multithreaded environment: intratask, intertask–intraprocessor, and intertask–interprocessor. Collaboration diagrams cover the first form, but they are not suitable for modeling the other two. For them, we need concurrency diagrams.10 These diagrams depict the logical object architecture in a multitasking solution, meaning we can depict the task structure of the solution as well as model the communication mechanisms between the tasks.

System architecture diagram. This type of diagram gives a detailed representation of the physical system structure, particularly for software distribution over nodes. (UML deployment diagrams are not suitable for this, although they can show some aspects of physical distribution.) The most important elements of such a diagram are processing nodes (or boards), storage nodes (disks), interface devices, and connections. We can thus take into account specific implementation characteristics of these elements such as storage capacity, execution speed, bandwidth, and further quantitative elements.

Approach 3: Combining UML with formal description techniques
We can distinguish between two ways of combining UML with MSC and SDL:12,13
■
Link approach, where some submodels are shown in UML and others in SDL; links create the relationship between them. Forward engineering, which means translating UML to MSC and SDL.
The link approach uses UML and SDL techniques in the same phase of the development process and involves major consistency problems. In the forward engineering approach (see Figure 6), UML is used in the conception and analysis phases and to some extent in early design. SDL, on the other hand, is
stronger in the later design and implementation phases, including automatic code generation. We use the MSCs to model and simulate use cases and interaction scenarios. Predefined transformation rules exist for translating UML semantics to SDL and MSC.14 However, fully automatic translation is not always a good idea, because the individual classes and associations in the UML model can be transformed to an SDL–MSC model in different ways.
In their standard forms, both MSC and SDL are not sufficient to derive complete performance information. Expansions might solve this problem, which we can divide into two groups:

■ Expansion of the SDL–MSC syntax, which we can do by adding further language constructs to describe real-time requirements and performance characteristics.15,16
■ Annotative expansion of SDL and MSC, which we can achieve through comment instructions (see the sketch after this list).17
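To picture the annotative route, here is a small sketch in which performance data ride along as comments in an MSC-like textual trace and are extracted afterwards. Both the trace format and the "/* perf: ... */" annotation syntax are invented for this illustration; real SDL and MSC tools each have their own mechanisms.

# Extract performance annotations that were added as comments to an
# MSC-like textual trace. The trace format and the "/* perf: ... */"
# annotation syntax are invented for this illustration.
import re

TRACE = """\
msc modify_request;
  user -> app : send_modify_request;   /* perf: latency=120ms */
  app -> db   : connect_db;            /* perf: latency=40ms  */
  db  -> app  : return_name_list;      /* perf: latency=150ms */
  app -> user : acknowledge;
endmsc;
"""

ANNOTATION = re.compile(r"(\w+)\s*->\s*(\w+).*?/\*\s*perf:\s*latency=(\d+)ms\s*\*/")

def annotated_latencies(trace_text):
    """Return (sender, receiver, latency_ms) for each annotated message."""
    return [(s, r, int(ms)) for s, r, ms in ANNOTATION.findall(trace_text)]

if __name__ == "__main__":
    rows = annotated_latencies(TRACE)
    for sender, receiver, ms in rows:
        print(f"{sender} -> {receiver}: {ms} ms")
    print(f"Total annotated latency: {sum(ms for _, _, ms in rows)} ms")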
One of the first approach’s disadvantages is the need to realize independent SDL–MSC development environments. Commercial tools do not allow any language expansion. The second approach is associated with a series of assumptions that present a problem
for practical application.17 The expansion of the SDL standard and the standardization of the UML–SDL connection mean that the significance of combining these two techniques is certain to increase in the future.

Tool evaluation
Table 1 contains only those tools that allow UML-based PE. This means that the tools explicitly support Approach 1, 2, or 3 described in this article. Widely used UML modeling tools such as Rational/Rose, Together, Paradigm Plus, System Architect, or StP/UML are not listed here because they do not permit any derivation of performance characteristics or information. The information in the table and the evaluations undertaken are based mainly on the product manufacturers' information.

Table 1. An evaluation of selected UML-based tools
(Columns: Tool and company | Type of diagram | Solution method supported | Model derivation | Integration into process model)

Support of Approach 1 (native UML)
PROSA/Insoft Oy | All UML diagrams | Simulation | State and collaboration diagrams | No
SPE.ED/L&S Computer Technology | Use case, MSC, and sequence diagrams | Queue model, simulation (CSIM) | Use case, MSC, and sequence diagrams | No

Support of Approach 2 (expanded UML)
Rose-RT or ObjectTime/Rational | All UML diagrams plus MSC | Simulation, debugging | Angio Traces derived from scenarios | Yes
Rhapsody/I-Logix | Sequence, class, use case, and state diagrams | Animation, simulation | Sequence and state diagrams | No
ARTiSAN Studio/ARTiSAN Software | All UML plus constraints, concurrency, and system architecture diagrams; timing table | Animation, simulation | Sequence and state diagrams | No

Support of Approach 3 (combined UML with FDTs)
Tau 4.0/Telelogic | All UML plus ASN.1, SDL, MSC, TTCN | Animation, simulation | Sequence and state diagram; SDL, MSC | Yes
ObjectGEODE/Verilog (Telelogic) | All UML plus ASN.1, SDL, MSC | Simulation | Expanded SDL, MSC | Yes

Existing tool support is extremely inadequate at the moment. This means either that the tool support for Approach 1 is limited to tools that come from the academic arena, or that we must transfer from software to performance models manually. The group of tools that support UML expansions (Approach 2), especially for real-time and embedded systems, contains several professional development environments. The lack of standardization of the expansion
concepts makes these harder to use. In addition, most of them are specifically directed at the target area—real-time and embedded systems—and allow the development of applications for specific hardware platforms. These are thus less suitable for use in classic information systems.
The tools that support a combination of UML and FDTs (Approach 3) are the most highly developed in comparison with the other two groups of tools. They have been around longer, having appeared on the market as pure SDL–MSC tools at the end of the 1980s. In recent years, they have successfully expanded in the direction of UML modeling.
Existing UML-based approaches do not satisfactorily meet the requirements defined earlier. Carrying out PE tasks using the UML constructs that are currently standard, especially in the context of classical information systems, seems insufficient for the practical use of UML for performance modeling and evaluation. For the representation and analysis of performance aspects within the UML framework, a structural expansion of this notation is absolutely essential. In addition to this expansion, combining UML with formal specification techniques such as MSC and SDL is a promising approach, especially for real-time and embedded systems. However, once again, standardization is essential. Only the combination of UML models with FDTs and classical performance models such as queuing networks, supported by a prototypical benchmark, allows the definition of an efficient performance modeling approach. Here's what a first proposal of such an integrated approach should look like:

■ Starting point: extensions of the UML notation for the representation of performance-related aspects. Standardization is required.
■ Early development stages: application of waiting queue- or net-based models.
■ Late development stages: application of tool-based solution techniques or a combination with FDTs.
We can use different methods to solve performance models: analytical, simulative, or both. Figure 7 summarizes this approach.

Figure 7. The raw architecture of a UML-driven framework.

We evaluated our proposed direct representation of performance aspects using UML with a case study of a prototype, CoNCCaT (component-oriented network centric computing architecture telekom). The prototype was developed within Deutsche Telekom and aimed to collect experience and investigate the opportunities for applying new software technologies. The prototype itself represents a multitier client-server architecture based on a client, an application server, and a database server. To describe the necessary architecture, we used an extended deployment diagram to show the required performance behavior of the basic components. Based on this kind of performance description, deriving executable performance models such as modified time-augmented Petri nets or queuing networks is possible.18 Execution of the models lets us discover several performance aspects of the whole system and identify potential bottlenecks. In this way, the practical relevance of the proposed approach is underlined.

References
1. C. Smith, "Performance Engineering," Encyclopedia of Software Eng., J.J. Maciniak, ed., John Wiley & Sons, New York, 1994, pp. 794–810.
2. G. Booch, I. Jacobson, and J. Rumbaugh, The Unified Modeling Language User Guide, Addison-Wesley, Reading, Mass., 1998.
3. Object Management Group, Profile for Scheduling, Performance, and Time: Request for Proposal, OMG document ad/99-03-13, 1999.
4. J. Warmer and A. Kleppe, The Object Constraint Language, Addison-Wesley, Reading, Mass., 1999.
5. R. Pooley and P. King, "The Unified Modeling Language and Performance Engineering," IEE Proc. Software, vol. 146, no. 2, Mar. 1999.
6. D.A. Menasce et al., "Resource Management Policies for E-Commerce Servers," ACM Sigmetrics, vol. 27, no. 4, Mar. 2000, pp. 27–35.
7. H. Lehikoinen, "UML Modeling and Simulation Raises Quality and Productivity," Proc. Component Computing '99, Tieturi, Helsinki, 1999; www.insoft.fi (current Nov. 2001).
8. B. Selic, G. Gullekson, and P. Ward, Real-Time Object-Oriented Modeling, John Wiley & Sons, New York, 1994.
9. B. Selic and J. Rumbaugh, "Die Verwendung der UML für die Modellierung komplexer Echtzeitsysteme (Using UML for Modeling Complex Real-Time Systems)," ObjektSpektrum, vol. 4, 1998; www.objecttime.com (current Nov. 2001).
10. Artisan Software, How Can I Use UML to Design Object-Oriented Real-Time Systems when It Has No Support for Real-Time?, Artisan Software, Portland, Ore., 1999; www.artisansw.com (current Nov. 2001).
11. B.P. Douglass, Real-Time UML: Developing Efficient Objects for Embedded Systems, Addison-Wesley, Reading, Mass., 1999.
12. ITU Recommendation Z.120: Message Sequence Charts (MSC), ITU General Secretariat, Geneva, 1992.
13. ITU Recommendation Z.100: Specification and Description Language (SDL), ITU General Secretariat, Geneva, 1992.
14. K. Verschaeve and A. Ek, "Three Scenarios for Combining UML and SDL 96," SDL'99: The Next Millennium, R. Dssouli, G. Bochmann, and Y. Lasav, eds., Elsevier Science, Netherlands, 1999, pp. 209–224.
15. M. Diefenbruch, J. Hintelmann, and B. Müller-Clostermann, "QSDL: Language and Tools for Performance Analysis of SDL Systems," Proc. 5th GI/ITG Specialist Discussion on Formal Description Techniques for Distributed Systems, 1995.
16. P. Leblanc, SDL Performance Analysis, Verilog white paper, Toulouse, France, 1998; www.csverilog.com/download/geode.htm (current Nov. 2001).
17. W. Dulz, "Performance Evaluation of SDL/MSC Specific Protocol Systems," Proc. Software Technology in Automation and Communication, Springer-Verlag, Berlin, 1996.
18. A. Schmietendorf, E. Dimitrov, and K. Atanassov, "Model-Based Analysis of an EJB Environment with Generalized Nets," Proc. 17th UK Performance Eng. Workshop, Univ. of Leeds, UK, 2001, pp. 1–12.
For more information on this or any other computing topic, please visit our Digital Library at http://computer.org/publications/dlib.
About the Authors

Evgeni Dimitrov is a member of the Information Technology Department at Deutsche Telekom AG, Berlin. His research interests include object-oriented technologies such as the implementation of a generic process model and strategic orientation for intelligent network applications. He has an MS in mathematics and a PhD for his conception and development of a net-based simulation system, both from Humboldt University, Berlin. Contact him at T-Nova Deutsche Telekom Innovationsgesellschaft mbH, Entwicklungszentrum Berlin, Wittestraße 30 N, 13509 Berlin; [email protected].

Andreas Schmietendorf works as a consultant for system and software development in the Information Technology Department at Deutsche Telekom AG. His main research interests include performance engineering, component software development, and software measurement. He has an MS and a PhD in computer science from the University of Magdeburg. He is an active member in the German Society of Computer Science and the CECMG Central Europe Computer Measurement Group. Contact him at T-Nova Deutsche Telekom Innovationsgesellschaft mbH, Entwicklungszentrum Berlin, Wittestraße 30 N, 13509 Berlin; [email protected].

Reiner Dumke is a professor of software engineering at the University of Magdeburg. His research interests include software measurement, CASE tools, OO development methods, and distributed systems. He is a member of the IEEE and ACM, a founder of the Software Measurement Laboratory at the University of Magdeburg, and coeditor of the Metrics News Journal. Contact him at Otto-von-Guericke-Universität Magdeburg, Institut für verteilte Systeme, Universitätsplatz 2, 39106 Magdeburg; [email protected].
manager Editor: Donald J. Reifer ■ Reifer Consultants ■ [email protected]
Continuous Process Improvement and the Risk to Information Assurance George E. Kalb and Gerald M. Masson
In today's climate, every good manager should pay attention to the technology gap developing between the deployment of information assurance (IA) products and the technical capabilities of exploiters who can successfully attack an enterprise's information assets. The growing number of would-be hackers eager to infiltrate a network-based computer system rep-
resents an ever-present threat to managers concerned with protecting the information assets those same computer systems host. Managers can counter this threat by acquiring and deploying protection technology within an environment that involves, at best, an incremental improvement process. However, the exploiter of the computer-based system works within a continuous process improvement environment. The difference between the exploiter’s and the manager’s environments creates an IA gap that leaves information assets vulnerable and therefore at risk. Exploiter’s environment Exploiters work in an environment of continuous and almost instantaneous process im84
provement that increases their knowledge, experience, and overall capabilities to attack computer-based systems. When targeting a computer system, the exploiter must determine an attack scenario that offers the greatest probability of success. Here, success involves a measured balance between successfully penetrating the system, achieving intrusion goals, and avoiding detection. Internet hacking sites and bulletin boards abound that enable easy access to the latest information resources that describe various attack scenarios directly applicable to the exploiter’s chosen computer system target. These sites offer the exploiter training and lessons learned from literally thousands of fellow exploiters. Our exploiter thus has access to up-to-date information taught by these thousands of virtual mentors without the delays typically needed to prepare courses or textbooks to bring academic materials into the traditional classroom. These hacking sites also offer readily available tools developed by other, probably more experienced, exploiters. Even without these tools, the exploiter might possess the knowledge to rapidly develop a suitable tool tailored to the selected attack scenario. Developing this tool would not involve burdensome software development that emphasizes processes to ensure product longevity. If the newly developed exploitation tool lets users achieve the end goal, its creators will likely post it for distribution. Other would-be exploiters could use it either intact or after further enhancements, fostering a continuous product enhancement and deployment cycle that rapidly 0740-7459/02/$17.00 © 2002 IEEE
distributes the latest in tool capabilities to the exploitation community. The availability of information resources and tools on the Internet fosters a more continuous than incremental technology refresh rate. The sheer IA technology gap es iti number of individual exploiters, workbil a p Version (n + 3) ca ing alone or in concert, also creates an 's r e it environment of continuous process imVersion (n + 2) plo Ex provement toward superior capabiliVersion (n + 1) ties. This is in contrast to the incremental improvement we would expect Version (n) had we limited our observations to a small sampling of exploiters targeting a single host platform. In the large, the Time novice script kiddies quickly absorb material posted by the more seasoned hackers who, in turn, given their large Figure 1. IA technology gap caused by protection tool development. numbers, are developing new tools and techniques against the broadest span of systems. The overall effect is a continu- ness level. At any given instant, an IA might also mandate an exhaustive ous increase in the average exploiter’s technology gap is present between a trade study of available protective capabilities. protective tool’s capabilities on the tools to justify procurement costs, furopen market and the average ex- ther delaying the installation process. Management issues ploiter’s capabilities (see Figure 1). Third, managers must quickly deManagers must deploy protective Second, managers must contend ploy the newly acquired protection technologies to counter the exploita- with the enterprise’s internal acquisi- technology across the enterprise. The tion threat. Yet these same protective tion process that introduces additional installation process might require sigtechnologies are bounded by myriad latency, delaying implementation of nificant time for larger enterprises typtime latencies that diminish their ef- relevant protection technologies (see ically having a fixed (relatively small) fectiveness when deployed. Figure 2). The enterprise might re- system security staff that must support First, managers must realize that quire prior review and approval of a an infrastructure consisting of numerany protection tool brings with it the security policy, or modification to an ous geographically separate host systool vendor’s development and de- existing policy, before changing the IA tems. Penetration testing, problem ployment latency. Protective tools are level. Internal acquisition processes identification, and troubleshooting based on requirements derived from the known state of exploitation at large. The tool vendor must have adequate time to develop, market, and ship the protection tool’s current version to prospective end users. The tool vendor might employ rapid-development techniques, increIA risk area mental requirements enhancements, s tie i l i or other techniques to close the IA b pa Version (n + 3) ca technology gap between the protec'r s ite Assurance level tion tool’s capabilities at deployment Version (n + 2) Employ (n + 2) plo x E and the current capabilities of exTrain Version (n + 1) ploiters. The tool vendor might also Install Assurance level (n + 1) supply software patches for upgradVersion Train Employ (n) ing the deployed protection tool to Install Assurance level (n) Policy Procure maintain concurrent capabilities to the exploitation threat. However, deTime veloping and deploying these upgrade patches requires time that results in, at best, more numerous and smaller Figure 2. 
IA risk area grows due to latency required to deploy incremental steps in the IA effective- protective technologies. January/February 2002
by the figures, are an unknown, difficult-to-quantify quantity. Managers might infer this capability from CERT reports, penetration logs, or past experience protecting the enterprise’s informational assets. However, at any given instant, they cannot reduce an estimate of the average exploiter’s capability to expedient, accurate, and meaningful mathematical terms. Hence, they must deal with the upper bound of this IA risk gap as an unknown and probably moving target, further exacerbating their ability to cost-justify the next incremental upgrade in IA technology.
Figure 3. Lack of rapid protection technology refreshment increases the IA risk gap, giving less capable exploiters opportunities to infiltrate.
can accompany the tool’s rollout, further delaying the protection this technology enhancement provides. Finally, no security solution is complete without the accompanying training of both the security administration staff and the end-user communities. Here again, the larger the enterprise, the longer it will take to ensure training coverage for all end users. Managers must deal with the time and cost of providing adequate training to personnel across the enterprise and the additional latency imposed to effectively reap the benefits. IA risk gap When added to vendor development latencies, latencies arising from the management-related issues we’ve described result in an IA gap that surpasses what existed before the protective technology deployment. However, the protective technology’s root technology base arises from initial requirements that the tool vendor specified for this specific release. Without incremental improvements released by the tool vendor and quickly installed by the enterprise, the protective technology will stagnate over time, allowing the exploiter’s capabilities to increase. The greater this period of protective technology stagnation—the longer management sustains the current protective technologies before procuring
the next IA level improvement—the larger the resultant IA risk gap. In Figure 3, this widened IA risk gap not only shows that the average exploiter’s capabilities are well advanced relative to the installed protection technologies, but that a growing number of less competent exploiters can now successfully mount an attack against an enterprise’s information assets. Managers might now expect more numerous attacks—and many more successful attacks—on their computer systems because the pool of capable exploiters has grown. Furthermore, the average exploiter’s exact capabilities, as depicted
By taking measures to diminish the IA risk gap, managers provide a better fortress for protecting the enterprise's interests. If we could tax every suspicious ping, malicious transmission, and illegitimate script foiled through these diligent activities to help defray the cost of increasing the IA level, more numerous incremental improvements could be afforded, resulting in a smaller IA risk gap. The smaller IA risk gap affords the enterprise the maximum protection because the greatest portion of all attacks will be detected and rendered unsuccessful. However, managers might never receive information concerning the unsuccessful attacks, but only the successful ones. These successful enterprise-destroying attacks are what force us to be attentive to closing the IA risk gap to manageable levels.
George E. Kalb instructs a foundations course in software engineering and an embedded computer security course at Johns Hopkins University. His research interests include software exploitation and protection technology, embedded computer systems, and software metrics. He has a BA in physics and chemistry from the University of Maryland and an MS in computer science from Johns Hopkins. Contact him at [email protected]. Gerald M. Masson is a professor of computer science
and director of the Information Security Institute at Johns Hopkins University. His research and teaching interests include reliable and distributed computing, as well as information security. He received a PhD in electrical engineering from Northwestern University and is a fellow of the IEEE. Contact him at [email protected].
requirements Editor: Suzanne Robertson ■ The Atlantic Systems Guild ■ [email protected]
Not Just the Facts: What “Requirements” Mean to a Nonfiction Writer Ashton Applewhite
Software people are not trained to write, and yet we have to write all the time as part of our jobs. Specifying requirements presents some unique challenges to the software professional in his role as a writer. To learn more about how to deal with these challenges, I asked a writer, Ashton Applewhite, to tell us what she does when she writes. I found many lessons here for the requirements writer, and I hope that you find as many of them as I did. —Suzanne Robertson
Maybe it's in the genes. My father, a spy by vocation and cultural historian by avocation, enjoyed a second career as R. Buckminster Fuller's amanuensis, setting down the inventor's system of geometry in book form. I, too, explain to the public the work of people about whose disciplines I often know little. I am a generalist who writes nonfiction. Although many of the people I interview are scientists, I don't consider myself a "science writer" (or a "business writer" or a "women's issues writer," for that matter, although I write on those topics as well). The craft of writing is subject-independent. Practitioners apply the same skills, or lack thereof, to an essay on George Sand or a manual for parking-garage attendants. From a requirements point of view, the challenge in what I do is to communicate complex ideas clearly, neither oversimplifying nor getting lost in the details. In the last year, I've written about trends in the commercial aviation industry, reintroducing Przewalski's horse to Mongolia, the best
way to amplify an acoustic bass, the promise of interactive media, and how pulsating stars called cepheid variables help physicists measure the universe. Many of these subjects were completely new to me at the time. Figuring out what readers need to know Lack of expertise can be a liability, but it can also work to my advantage, because my readers generally share my condition. This means I don’t have to wrestle with what the editor of this column calls “the problem of unconscious requirements.” This occurs when an expert assumes that his audience knows as much as he does and so omits essential information, or omits it in an effort not to seem patronizing. I do, however, have to think hard about who and how knowledgeable my readers are likely to be and what they’re likely to be interested in learning more about. My readership and the purpose of the piece determine the initial requirements. Systems engineers often complain that, “People don’t tell me what they want.” I can certainly relate to this. Sometimes an editor will offer guidance, perhaps explaining how the article should fit into a series. Other January/February 2002
times, context does the trick; a newsletter for science teachers focusing on observation techniques cues me to pursue the nuts and bolts of that process from a pedagogical perspective. Most of the time, though, I’m in the same boat as the systems engineer. People don’t know what they want, especially when it comes to something as abstract as information. Just as in any design process, a mixture of research, experience, and intuition shapes the answer. Knowing the audience The first question is who am I writing for? Kids? Stockholders? The general public? The answer primarily governs the vocabulary and depth I go into in an interview, rather than the nature of my questions. When I interviewed Neil Tyson, director of New York City’s Hayden Planetarium, for a children’s Web site, the language had to be kept simple, but not the topics under discussion. I asked him what drew him to the field. He said that it was a suspicion that the starry dome he encountered on his first visit to the planetarium, so different from the view from his Bronx rooftop, was a hoax! I also asked what it was like to be one of the dozen or so black astrophysicists in the world. “No one invited me into the physics club,” said Tyson, “so I became the fastest guy on the block, and I don’t regret it. But I realized just how easily society enabled me to become an athlete and how much energy I had to draw on to do anything else.” Undeterred by conventional expectations, the teenager stayed focused on astrophysics. Focusing on the task at hand Second, what’s the piece’s primary purpose? Sometimes it’s to explain a specific technology, such as how geologist Elise Knittle simulates conditions in the inner Earth by placing a speck of metal in a diamond-anvil high-pressure cell—essentially a hightech nutcracker—heating it with an infrared laser, and extrapolating the results to slabs of oceanic crust thousands of kilometers across. “I need to 88
get chemical and structural information out of a sample so small it looks like a piece of pepper,” says Knittle, “and I need to figure out the best way to go about it, because most of the time these experiments don’t work.” In cases like this, I ask a zillion questions—as many as it takes to understand the process well enough to describe it clearly. This is way less humbling than having to admit in a follow-up call that I didn’t actually follow that bit about measuring the fluorescence of ruby chips. Ask more questions Of course there’s such a thing as a stupid question, kindergartners aside. But the consequences of unasked questions, even a seemingly obvious one, can be far greater. In general, specialists are not only tolerant but appreciative of my efforts to figure out just what it is they do. Details are essential, especially when describing something abstract or complex. Comparing Knittle’s tools and techniques to everyday objects like a nutcracker and grains of pepper makes the process far easier to visualize. But how do you decide how many details are necessary? Computational biologist Andrea Califano of First Genetic Trust came up with a helpful analogy when talking about predictive biology: “Today we can predict the weather on a relatively small grid, which requires a fairly ex-
traordinary amount of computing power. Computationally, there’s really not much difference between modeling weather and modeling the behavior of a cell.” He went on to comment that although biologists can corroborate in silico results with in vitro experiments, this is not always an option. “You don’t corroborate the weather do you?” he teases. “At some point, you have enough knowledge to trust the system that predicts it.” Some of my pieces of writing serve a specific purpose, as in a set of profiles explaining why a group of science PhDs opted for business over research careers. “In business, whether the answer is 3.5 or 3.8 is less relevant than the fact that it is around three or four and not 30 or 40,” says Ujiwal Sinha, a mechanical engineer with McKinsey & Company. “A lot of scientists are motivated by the fact that they can spend another two years on it and get precision to the 10th decimal place, whereas what excites me is the challenge in getting to an answer.” Molecular biologist Jenny Rooke, also with McKinsey, was fascinated by genetics until she had to spend three years squashing up fruit flies in a lab and coming up with “only a minuscule component of an answer. We only thought about the overall question itself maybe 10 percent of the time, and I found that deeply dissatisfying.” My tactic in understanding why these scientists went into business was to press for specific experiences and link each back to the theme of professional growth until the full story emerged. Down with jargon There’s something else that might not have anything to do with the upfront assignment: making the piece interesting. Clear language is half the battle. There’s a scientific term for what Rooke does to those fruit flies, but if I don’t know what a word means, lots of my readers won’t either. Jargon can be a short cut for the cognoscenti, especially in highly crossdisciplinary fields such as systems and requirements engineering, but it’s a stumbling block for the rest of us. Terms such as usability and informa-
tion technology can have many different meanings. I asked Minerva Yeung, an engineer at Intel, what information technology means to her. “My specialty in information technology deals with how we process digital signals,” she replied. “For example, what are the algorithms that allow the information—the zeros and ones that make up the information sequences—to be operated on with more efficiency and better quality? And then how should that information be organized and distributed? That’s the whole field, from software to communications.” Other people have given me completely different answers to the same question, because each answer reflects that person’s particular experience and knowledge base. There’s always a story I try to introduce personal details that bring people to life. Everyone has stories to tell, and tapping into them keeps readers reading. For instance, I also asked Minerva how she got her name, because I was curious and because I figured it wasn’t a question she is often asked by engineering periodicals. Minerva’s English high school in Hong Kong asked her for an English name, so she looked up ones that started with the same sound as her Chinese name, Ming-yee. “I knew it was the name of the goddess of wisdom and that I didn’t have sufficient wisdom at the time,” she recalled disarmingly, “but I loved the name, so I picked it.” Sometimes the story tells itself as fast as I can type; occasionally, I have to go after it with Novocain and pliers. Some people are hard to establish a rapport with, or veer off on tiresome tangents. Others are stunningly inarticulate, in which case I might suggest a word or phrase, although I never substitute mine for theirs without express permission. If people are really interested in something, whether philately or phylogenetics, whether tongue-tied or TV-trained, they can talk about it in an engaging way— maybe with a little help. That’s where homework comes in. Before an interview, I try to get my
hands on a CV, examples of my subject’s work, and a prior interview or article about the person if possible. I then use their interests and achievements as a springboard for conversation. The more specific my questions, the livelier the answers, and the more vivid and instructive the final article. “What was the most exciting discovery of your career?” is a lazy question. “How did you feel when you came across the first fossilized dinosaur skin in Patagonia?” is an engaging one more likely to produce an engaging answer. Knowing when to wing it Even more important is to know when to throw my homework out the window. In the course of an interview, I might sniff out passions and pursuits even though they seem tangential to the topic at hand. Ultimately, I trust my instincts. What am I curious about? What music does a renowned musician listen to at home? None at all in the case of vibraphonist Gary Burton: “I have an innate fear of it somehow destroying me if it were omnipresent.” Why fish? “There’s pretty much nothing that fishes don’t do,” responds ichthyologist Ian Harrison. “They don’t only swim; they walk, they fly. And they span an incredible range of habitats, from the deepest ocean trenches to Tibetan hot springs over 5 kilometers high. African lungfish can pretty much just sit in the mud for a few months or even years, and come back out when the rains return.” Who knew? I had the privilege of speaking to Nadine Gordimer on the completion of her first novel set in
postapartheid South Africa, so I asked what kind of influence she thought fiction had on politics. She replied, “Newspapers and TV show the raid, the moment of crisis, children advancing against tear gas. You didn’t know what kind of lives they lived before or how they put their lives together again afterwards. That is the dimension of fiction, and certainly they underestimated its power.” Boiling it down to the essentials Requirements are tweaked again during the hardest part: turning a transcript and a pile of notes into a coherent article. What should be emphasized and what consigned to electronic oblivion? What constraints does the piece’s structure impose? Do the ideas follow logically and clearly? How short can I make it—shorter is always better, and harder—and still cover all the bases? When should narrative trump brevity? During the writing and rewriting process, I wrestle with these questions. If my deadline affords, putting the piece aside for a day or two provides a helpful distance. A good editor can help immensely, and a poor one, like an idiot client, is a maddening obstacle. I give my subjects the right to review and make changes, although I’ll advocate for the version I think best. And I’ll revise the piece as often as it takes until everyone involved is satisfied. The best indicators of how successfully I’ve communicated my message are the responses from editors and readers—getting paid and getting referrals are good signs, too. Still, mine is a highly subjective line of work. It’s readily apparent if the teapot scalds the pourer or the software is hopelessly buggy, less so if no IEEE Software subscriber reads this far. But I hope some of you have and that some of my experiences with requirements help you with yours.
Ashton Applewhite lives in New York City. She writes
part time for the American Museum of Natural History and is a contributing editor for IEEE Spectrum magazine. Contact her at [email protected].
in the news Features Editor ■ Scott L. Andresen ■ [email protected]
Federal Government Calls for More Secure Software Design
Events of 11 Sept. galvanize agencies, security experts, but will it matter?
Greg Goth, [email protected]
C
ircumstances and good intentions may finally have combined to make a difference in the effort to design, build, deploy, and audit more secure software across government and private sector infrastructures. The terrorist attacks of 11 September brought forth pronouncements and conferences, reorganizations, and resolutions dedicated to improving the security of networks large and small. At a recent press conference sponsored by the FBI and the System Administration, Networking, and Security (SANS) Institute, acting chief information officer of the US Air Force John Gilligan, who also served as vicechair of the Federal CIO Council’s security committee, publicly called for more stringent quality measures by software manufacturers. “I would like to take this opportunity to call on our nation’s Internet and software industry, the industry that has helped fuel the tremendous prosperity that we have enjoyed over the past 10 years, to take a new approach to the design and fielding of their products,” Gilligan said. “In short, none of us can afford the cost of a continual race against would-be cyberattackers using the current ‘find and patch’ approach to deal with latent vulnerabilities in commercial software packages. “It is clear that the quality of software design and testing in the past does not measure
up to the needs of the present and future. I challenge the leaders in the software industry, especially in the wake of the physical attacks on this nation, to work together to establish new standards of software quality, as well as effective methods to reduce the impact of current vulnerabilities.” However, experts disagree on whether Gilligan’s words were a once-and-for-all clarion call to the software industry to improve its practices or merely a well-meaning BB across the bow. “They’ve been talking about this for years,” says veteran cryptographer Bruce Schneier, chief technology officer of Counterpane Internet Security. “Why should this time be any different? In a lot of ways this could be a good idea. Done wrong, it could be a terrible idea.” Alan Paller, director of research at the SANS Institute, says the 11 September attacks have modified the equation substantially, however. Paller says network security is no longer perceived as an add-on. “If we hadn’t had September 11, I would have told you it was going to be a slog, a long, slow, unpleasant trip,” Paller says. “But now it’s a very rapid trip.” Paller says the terrorist attacks, and the resulting crippling of some companies’ networks, have injected a new sense of urgency into corporate and government security measures. Early adopters will incorporate tools and monitoring into their systems within five to seven months, he believes. Consequently, they will demand software vendors supply more secure products.
“It’ll go a little more slowly for those who aren’t early adopters, but it’s no longer acceptable to shrug off these questions with an ‘I don’t know when it’s going to happen’ response,” Paller says. Many questions remain Behind the good intentions and flurry of activity, however, questions remain about how new standards and tools to measure them might be employed. Just a few of those questions include: ■
■ Who will be involved in devising and monitoring these new software security standards, if they are adopted?
■ How would these standards differ from various benchmarks already in place, such as the Defense Department's Orange Book, the Common Criteria Evaluation, and international standards such as BS 7799 (a British standard since adopted by the ISO)?
■ How will the private sector and governmental agencies at various levels be able to provide input?
■ At which point will recognized standards bodies such as the IEEE, IETF, and ITU be involved?
■ What kind of enforcement mechanisms will penalize manufacturers of substandard software?
“The realities are these,” says Jonathan M. Smith, a University of Pennsylvania researcher who recently received a $2 million grant from the Defense Advanced Research Projects Agency (DARPA) to explore ways to integrate security features from the OpenBSD system into mainstream office workstations. “The problem you have is that people have incentives that may push them away from cooperation. This is always a problem. Standards always benefit someone, so when people are making money they look at standards and say, ‘They could make me money or they could hurt me.’ So you have some potential disincentive to standardization.” In fact, Smith says the history of industry’s response to standards-mak-
ing efforts is checkered at best. “There are people who did have benchmarks,” he says, “this almost quantitative, checklist type called the Common Criteria. The issue has been for industry, it’s very expensive to go through these kinds of tests. They’re very rigorous and involved, and because of that, expensive to test. And if you don’t really see a market there, it’s tough to get yourself motivated to do something that costs you a lot of money for the good of the nation. That doesn’t seem to be the sentiment that drives a lot of folks.”

Stopping the buck
Thus far, the portion of the federal government dedicated to promoting and protecting Internet security has been directing its software security effort at reorganization. While the CIO Council’s Gilligan issued the public challenge to software makers, the Council, which operates under the General Services Administration, is undergoing its own reorganization, and the call for new standards will not be under its scrutiny. Nor will the effort to encourage standards-making fall under the wing of the National Institute of Standards and Technology, according to NIST sources. A General Services Administration spokesman said the effort would be coordinated through the office of Richard Clarke, the president’s special advisor for cyberspace security. Clarke’s office was moved in October from the National Security Council to the newly created Office of Homeland Security. Currently, the office has an extremely small staff and has made it clear it is in no position to dictate standards any time soon.
How then—if the office calling for standards isn’t responsible, the office in charge of maintaining standards isn’t responsible, and the office given official responsibility isn’t ready—will the new effort have any hope at all? Perhaps the brightest hopes lie in a combination of incremental efforts that include government grants to researchers such as Smith, increased input from state CIOs with greater knowledge of the software industry in their own jurisdictions, and public-private consortia providing comprehensive nonproprietary tools and benchmarks.
One such organization is the Center for Internet Security (CIS), a Nashville-based entity that has grown to 180 members worldwide since it was founded in October 2000. The center’s mission is simple, according to CIS president Clint Kreitner. “Our purpose is to create and propagate globally accepted detailed security standards, starting at the operating system level and working up through firewalls and databases,” he says. “Along with that, providing scoring tools, which we believe is the genesis of a whole new genre of software, tools that enable an organization to constantly monitor their security status.”
The center’s collaborative approach has already yielded benchmarks and scoring tools for Solaris and Windows 2000. These benchmarks, available for free download, are called CIS Level I, the prudent level of minimum due care for OS security. Kreitner says the effort that launched these benchmarks arose from a conversation CIS chairman Frank Reeder held with other IT industry leaders in August 2000. Reeder was previously director of the Office of Administration in the Executive Office of the President, responsible for information technology and telecommunications, as well as chief of information policy for the US Office of Management and Budget. “They said when it comes to security, we need not only standards such as BS 7799 and the Information Systems Audit and Control Association’s COBIT policy and process guidelines, but we need to get down in the weeds and deal with technical security settings,” Kreitner says. “The center is nothing more than a mechanism for focusing people on these various platforms, then facilitating the consensus process.”
Vendors are not eligible to join the CIS, Kreitner says. This ensures a neutral evaluation process, which includes draft versions written by members, followed by member and public comment, and a continuous update effort. “We do engage them as members of our vendor advisory council. We give them the opportunity to comment on our drafts during the development process.”
Kreitner says the CIS is also in constant contact with people at the Software Engineering Institute and other centers for engineering quality improvement. “As long as the vendors sell buggy software whose security features are disabled, we’ll start off behind the eight ball,” he says. “The idea of getting some standards that are easy to do and get a consensus, we’ll be able to push back against the vendors, to tell them, ‘Don’t ship your systems wide open. Ship them with at least CIS minimum security benchmarks installed as the default,’ so at least users will have the minimum level of security, and they’ll be able to resolve conflicts with the applications from there.”
The SANS Institute is a founding member of the CIS. Paller sees the effort as a viable alternative to available proprietary network evaluation products and services. In the era Paller sees looming, in which network security is an everyday concern of corporate and governmental managers at the highest levels, the CIS toolset might become the de facto benchmark. “I think Clint’s tools will be a common denominator,” Paller says. “I hope they won’t be the lowest common denominator.”
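The scoring idea behind these benchmarks is simple to picture. As a rough, hypothetical illustration (this sketch is not CIS code, and the two settings it checks are invented examples, not items quoted from the CIS Level I benchmark), a minimal Unix scoring tool in C might verify a couple of technical settings and report what fraction pass:

/* Hypothetical illustration only -- not the actual CIS scoring tool.
   Scores a Unix host against two example "benchmark" settings and
   prints a percentage, the way a benchmark scoring tool reports
   minimum-due-care compliance. */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* One benchmark item: a file that must not be world-readable or world-writable. */
static int check_no_world_access(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return 0;                          /* missing file counts as a failure */
    return (st.st_mode & (S_IROTH | S_IWOTH)) == 0;
}

/* Another item: a config file must contain a required directive.
   (Naive substring match; a real tool would parse the file properly.) */
static int check_config_contains(const char *path, const char *directive)
{
    char line[512];
    int found = 0;
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof line, f))
        if (strstr(line, directive))
            found = 1;
    fclose(f);
    return found;
}

int main(void)
{
    int passed = 0, total = 2;

    /* Example settings; a real benchmark enumerates a much longer list. */
    passed += check_no_world_access("/etc/shadow");
    passed += check_config_contains("/etc/ssh/sshd_config",
                                    "PermitRootLogin no");

    printf("benchmark score: %d/%d (%.0f%%)\n",
           passed, total, 100.0 * passed / total);
    return 0;
}

A production tool would apply the same pass/fail logic across a much longer, consensus-defined list of settings and recheck them regularly, which is what makes continuous monitoring of security status possible.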
Time for a chat
While the CIS effort focuses on improving operating system security and disseminating the benchmarks as widely as possible, DARPA is taking a micro approach to the problem. Smith’s project is one of 12 sponsored by the agency under the umbrella name of Composable High Assurance Trusted Systems (Chats), which will strive to incorporate security improvements perfected through the open source model into the public infrastructure, and perhaps into the private sector as well. Smith says the impetus for Chats came through years of failed policy in which the government invested in improving mainstream technologies that were already in wide use. “They’d end up with what I call a TOAD, or technically obsolete at delivery,” Smith says. “Under that philosophy, you end up with a secure two-year-old version of software, and that’s got problems.”
The new DARPA approach might do more than improve security on open source Web servers and operating systems such as Apache and Linux. Smith believes the work he and his Chats colleagues perform, using the open source model as a de facto standard-setting effort in which the best ideas survive, will filter into the commercial realm. “It’s actually not a bad thing, even for the commercial operating system folks. Under the BSD licensing style, the software is available to them, too. And I’ve at least heard discussion on mailing lists that Microsoft has no hesitation about using the FreeBSD TCP/IP stack. I don’t have any issue with Microsoft if they want to take the code and use it. The main thing is, let’s get going with the intellectual process of refining code and making it secure, and auditing it lets many, many people understand it. A rising tide lifts all boats. In the marketplace, the best idea may not win immediately, but over time it generally does. So if the way that idea gets absorbed is a licensing scheme that lets Microsoft use chunks of OpenBSD, then bravo. Those guys pay taxes and should be able to reap the benefits of this as much as any American taxpayer.”

States’ rights
The role of the states might be a wild card in the software development and auditing process. The National Association of State CIOs (NASCIO), at its November conference, discussed some unifying principles NASCIO president Rock Regan called revolutionary for the organization. A unified framework policy, he says, will make it easier to govern relationships among states, between state and federal levels, and with vendors. “This was the first time the states have talked together about these issues and dealt with them on a national basis and not in 50 different ways in which we have 14 different touch points with the federal government,” says Regan, Connecticut’s CIO. “We’ve been working on a common architectural framework, and of course, part of this framework
will be security and privacy as well as application development. “We didn’t invite the vendors to this conference, and some of them were a little ticked off. Frankly, we wanted to get our act together first. Then we need to bring in the corporate side to talk about solutions to fit into the framework. We all know who the industry players are locally. I know who the players are in Connecticut. My colleagues know who they are in Utah, Georgia, and so forth. But unless we develop a common framework, we’re going to come up with 50 different solutions.” Should the NASCIO effort succeed, it might be a boon for software makers. A uniform set of state guidelines, compatible with the interfaces necessary for state-federal data exchange, would present a well-understood benchmark and increased potential market. Amid the confusion of federal reorganization, it might also help establish a precedent. “The states can move much quicker than the federal government in these issues,” Regan says, “not just in IT, but in many different programs, whether it’s welfare reform, unemployment reform, or job creation. We were able to tell some of the federal folks involved in that conference where some of the disconnects have been. I think to a large part, some of them were surprised to learn states
don’t necessarily feel we have the proper input or proper role.” Regan is careful to say NASCIO will leave the coordination of standards to Clarke’s office, a role administration officials express willingness to tackle.

Useful URLs
Center for Internet Security: www.cisecurity.org
National Association of State CIOs: www.NASCIO.org
DARPA’s Chats project: www.DARPA.mil/ito/research/chats
SANS Institute: www.sans.org

Taking a collective deep breath
There is no need to rush in a set of benchmarks just for the sake of assuaging a sense of post-11 September urgency, according to Smith. “Things done in a panic end up looking like they were done in a panic,” he says. “We don’t want to look at this when we’re hotheaded or emotional. When you talk about security, what do you mean? It depends on the context.”
The SANS Institute’s Paller agrees. Gilligan’s call for action wasn’t asking for the impossible tomorrow, Paller says. “He was explicitly telling them, ‘We’re not asking you to create perfect code.’ Nobody can do that. He was telling them, ‘Create good code, don’t sell it to us broken. Sell us the latest configuration in every case and then keep it up to date for us.’ ”
In fact, he says, Gilligan’s remarks might have been the necessary catalyst for the dialogue now beginning to emerge across private and public sectors. Such dialogue, he says, is bound to help vendors make better software, bound to help administrators keep better tabs on their networks, and bound to help public officials fine-tune the
lines of communication. Paller says such dialogue has already begun, in which public sector IT executives are negotiating more secure configurations from key vendors right out of the box.
“It doesn’t make sense to pay a thousand guys to configure a wide open system when you can pay the vendor once to configure it,” Paller says. “It’ll be a contractual issue. Better security will have to be arrived at
by contract, not fiat.” “We just want to do our part,” CIS president Kreitner says. “We don’t care who gets the credit, we just want to get the job done, to make the Internet a safer place for everybody.”
Secure with Cyclone language
Cornell University researchers and AT&T Labs are developing a programming language designed to catch the unforeseen programming errors that lie behind many security breaches. The language, called Cyclone, is a redesigned version of the programming language C. “Our ultimate goal is to have something as humongous as the Linux operating system built in Cyclone,” Cornell University researcher Greg Morrisett told NewScientist.com. Cyclone’s data representation and calling conventions, which are interoperable with C-like programming styles, should simplify porting code to Cyclone, according to the developers. Cyclone prevents programmers from introducing bugs when writing code by enforcing checks through its type system. The Cyclone compiler identifies segments of code that could cause problems, using a
“type-checking engine.” It doesn’t look for specific strings of code, but analyzes the code’s purpose and singles out constructs known to be potentially dangerous. Additionally, data type information and runtime checks help prevent array boundary violations—unchecked buffers or buffer overrun conditions. The Cyclone compiler will rewrite the code or suggest fixes to avoid potential bugs. Even if a bug still occurs, the compiled program will halt safely rather than crash. According to the team’s Web site (www.research.att.com/projects/cyclone), Cyclone also offers other programming features, such as tagged unions, parametric polymorphism, pattern matching, a lexer generator, a parser generator, function-level debugging with tools such as gdb, and profiling with gprof.
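To make the class of defect concrete, here is a small sketch of our own in plain C (it is not Cyclone code, and the function names are invented for illustration): the unchecked version compiles silently even though it can write past the end of an array, while the checked version adds by hand the kind of bounds test that, as described above, the Cyclone compiler and runtime supply automatically.

#include <stdio.h>
#include <stdlib.h>

#define BUF_LEN 8

/* Unchecked write: standard C compiles this without complaint,
   even though an index of BUF_LEN or more corrupts adjacent memory. */
static void store_unchecked(int *buf, size_t index, int value)
{
    buf[index] = value;              /* potential buffer overrun */
}

/* Hand-written guard: the kind of bounds test described above as
   inserted automatically; out-of-range access halts safely. */
static void store_checked(int *buf, size_t len, size_t index, int value)
{
    if (index >= len) {
        fprintf(stderr, "index %zu out of bounds (len %zu)\n", index, len);
        exit(EXIT_FAILURE);          /* halt safely instead of corrupting memory */
    }
    buf[index] = value;
}

int main(void)
{
    int buf[BUF_LEN] = {0};

    store_checked(buf, BUF_LEN, 3, 42);    /* within bounds: stored normally */
    store_checked(buf, BUF_LEN, 9, 42);    /* caught: program exits cleanly */
    store_unchecked(buf, 9, 42);           /* never reached; would be undefined behavior */
    return 0;
}

In Cyclone, the programmer would not write such a guard by hand; according to the description above, out-of-range accesses are caught and the program halts safely instead of corrupting memory.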
Off-the-shelf software powers International Space Station troubleshooter
Using commercially available hardware and software, astronauts can troubleshoot a variety of subsystems on the International Space Station during construction and even issue commands to sensitive avionics equipment. During one mission last year, astronauts used a laptop to test gyroscope spin motors—used to maintain the station’s proper orientation relative to Earth—and control the gyroscope’s heaters. Using off-the-shelf equipment makes sense to Rick Alena, a computer engineer at NASA’s Ames Research Center who developed the laptop computer diagnostic tool, called Databus Analysis Tool (DAT). “We can use commercial computer systems to support mission and payload operations in space flight because they have the performance required and run a large range of software,” Alena says. The DAT includes a computer card and software that can monitor status and command messages sent between onboard control computers and major space station subsystems, including solar arrays, docking ports, and gyroscopes. It allows onboard monitoring of the computer
control network that ties the avionics components together. The DAT team currently uses SBS Technology’s Pass 1000 software, Alena says, “but there are other systems out there that are equally capable.” According to Alena, adapting commercial systems for use in space is the basic accomplishment of DAT. “We’re using the same product that is used on the ground,” he says. “The difference is making sure it meets space flight standards of safety, reliability, and quality assurance.” That’s the same SRQA that ground-pounding engineers worry about, Alena says. “We assess electrical safety, offgassing, and flammability. The ramifications of a safety failure increase dramatically when you use a system aboard the space station.” Reliability gets a hard look as well, Alena says, because the space environment can be tough on computer systems. “There are thermal characteristics and higher radiation levels that can adversely affect electronics. We have to validate any device that’s going to be used in space.”
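As a purely hypothetical sketch of what such monitoring involves (the record layout, subsystem identifiers, and limits below are invented for this example; they are not NASA’s, SBS Technology’s, or the DAT’s), a status monitor in C reads telemetry records from a subsystem and flags values that drift outside a nominal envelope:

/* Hypothetical status-message monitor: invented record format and limits,
   shown only to illustrate the idea of watching subsystem telemetry. */
#include <stdio.h>

struct status_msg {            /* one invented telemetry record */
    int    subsystem_id;       /* e.g., 1 = gyroscope assembly (made up) */
    double spin_rate_rpm;      /* reported spin motor speed              */
    double heater_temp_c;      /* reported heater temperature            */
};

/* Flag any reading outside the (made-up) nominal envelope. */
static void check(const struct status_msg *m)
{
    if (m->spin_rate_rpm < 6000.0 || m->spin_rate_rpm > 7000.0)
        printf("subsystem %d: spin rate %.0f rpm out of range\n",
               m->subsystem_id, m->spin_rate_rpm);
    if (m->heater_temp_c < 10.0 || m->heater_temp_c > 40.0)
        printf("subsystem %d: heater %.1f C out of range\n",
               m->subsystem_id, m->heater_temp_c);
}

int main(void)
{
    /* A short, invented stream of status messages. */
    struct status_msg stream[] = {
        {1, 6600.0, 21.5},     /* nominal                          */
        {1, 6598.0, 44.2},     /* heater too hot: flagged          */
        {1, 5400.0, 22.0},     /* spin rate low: flagged           */
    };
    for (size_t i = 0; i < sizeof stream / sizeof stream[0]; i++)
        check(&stream[i]);
    return 0;
}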
Send news leads to Scott Andresen at [email protected]