Using SOAP
John Paul Mueller
201 W. 103rd Street Indianapolis, Indiana 46290
Contents at a Glance
Introduction
1 An Overview of SOAP
2 SOAP in Theory
3 An Overview of Security Issues for SOAP
4 Using SOAP to Create a Simple Application
5 Migrating an Application from DCOM to SOAP
6 Creating Remote Access Utilities
7 Creating Data Entry Forms and Surveys
8 Providing Remote Database Access
9 Moving to Web-based Applications
10 Working with PDAs
Appendixes
A SOAP Data Types and Data Type Conversions
B Microsoft BizTalk and SOAP
C Third-Party Tool Reference
D SOAP for Visual C++ Developers
Glossary
Index
Associate Publisher Dean Miller
Special Edition Using SOAP
Copyright 2002 by Que
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of the information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions. Nor is any liability assumed for damages resulting from the use of the information contained herein.
International Standard Book Number: 0-7897-2566-5
Library of Congress Catalog Card Number: 2001090465
Printed in the United States of America
Acquisitions Editor Michelle Newcomb
Development Editors Sarah Robbins, Maureen McDaniel
Managing Editor Thomas F. Hayes
Project Editor Karen S. Shields
Copy Editors Karen Gill, Kay Hoskin
Indexer Johnna Dinse
Proofreader Marcia Deboy
First Printing: September, 2001
Executive Editor Candace Hall
Trademarks
All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Que cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.
Warning and Disclaimer
Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied. The information provided is on an “as is” basis. The author and the publisher shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book.
Technical Editor Russ Mullen
Team Coordinator Cindy Teeters
Media Developer Michael Hunter
Interior Designer Ruth Harvey
Cover Designers Dan Armstrong, Ruth Harvey
Page Layout Tricia Bronkella, Rebecca Harmon, Cheryl Lynch
Contents
Introduction
1 An Overview of SOAP
  What Is SOAP?
  How SOAP Differs from DCOM and CORBA
  DCOM Wire Protocol
  CORBA IIOP
  Java RMI
  SOAP, HTTP, and XML
  Problems Solved by Using SOAP
  Performance Issues
  Tradeoffs of Using SOAP
  When SOAP Is Faster
  SOAP and the Web Server
  Why Make the Move to SOAP?
  Case Study
  Definition of a Problem
  Light at the End of the Tunnel
  Perfection, an Ongoing Process
2 SOAP in Theory
  Dissecting the SOAP Message
  Viewing the HTTP Portion of SOAP
  Viewing the XML Portion of SOAP
  Working with the SOAP Message
  Using SOAP with Your Current Code
  Discovering SOAP Services
  Understanding Discovery of Web Services (DISCO)
  Working with the Web Service Description Language (WSDL)
  Understanding Universal Discovery, Description, and Integration (UDDI)
  Putting Everything Together
  Using SOAP to Move Data
  Current SOAP Implementation Problems
  SOAP and XML-RPC
  SOAP and XML Protocol (XMLP)
  Understanding SOAP Attachments
  Case Study
3 An Overview of Security Issues for SOAP
  Introduction
  Understanding SOAP Privacy and Security Issues
  What Are the Issues?
  User Privacy Issues
  Data Security Issues
  Security Standards You Should Know About
  HTTP Authentication Framework
  Secure/Multipurpose Internet Mail Extensions (S/MIME)
  Secure Socket Layer (SSL)
  User Identification Issues
  Using Smart Cards
  Using Biometrics
  Where Do You Go from Here?
  Case Study
4 Using SOAP to Create a Simple Application
  Introduction
  An Overview of Microsoft’s SOAP Toolkit
  Toolkit Features
  Three Types of Microsoft SOAP Toolkit Application
  Problems You’ll Experience
  Using the Microsoft SOAP Toolkit
  An Overview of the Application
  How SOAP Applications Differ
  Basic Application Design and Data Flow
  Understanding the Simple Application in This Chapter
  Shortcuts for Creating SOAP Applications Quickly
  Understanding Namespaces, the Short Version
  Creating the Server Side Code
  Designing the Component
  Designing a Listener
  Creating the Client Code
  Testing the SOAP Application
  Checking the Example Application for Errors
  SOAP Testing Tips
  Testing Your Server
  Handling SOAP Errors
  Performance Concerns for all Applications
  Attributes Versus Elements
  Code Optimization
  Project
5 Migrating an Application from DCOM to SOAP
  Introduction
  SOAP Application Conversion Prerequisites
  An Overview of the Conversion Process
  Deciding Which Application Modules to Change
  Avoiding Protocol-Related Problems in Modified Applications
  Integrating New Modules with Existing Application Elements
  Implementing SOAP with COM Language Binding
  Productivity Tips
  Updating a Simple Utility Program
  Updating the Server-Side Component
  Creating the Local Component
  Updating the Standard Client
  Updating a Data Viewer
  Understanding the Importance of Three-Tier Programming
  Updating the Common Business Logic Component
  Separating the Data Viewing Logic from the Main Database Component
  Creating the Data Viewer Client
  Updating a Complete Database Application
  Modified Application Concerns
  Reliability
  Security
  Performance
  Troubleshooting
6 Creating Remote Access Utilities
  Introduction
  An Overview of Remote Access Utilities
  Uses for Remote Access Utility Applications
  Understanding the Web Services Difference
  When Is SOAP Overkill?
  Making Utility Programs Flexible
  Shortcuts for Using Existing Components with Utilities
  Utility Program Security Issues
  Non-Issues for Utility Programs
  Writing a Server Status Viewer
  Creating the Server-Side Component
  Tips for Working with Server Status Information
  Working with the ISAPI Listener
  Creating the Client
  Creating a Simple Employee Check-In Application
  Creating the Component
  Some Caveats About WSDL Files
  Creating the Client
  Testing the Application
  Project
  Creating Your Own Utility
  Upgrading the Server State and User Check-In Utilities
7 Creating Data Entry Forms and Surveys
  Determining Which Data Entry Vehicle to Use
  Shortcuts for Data Entry and Survey Applications
  Reducing the Number of Round Trips
  Choosing Document or RPC Style WSDL Files
  Empty and NULL Value Processing
  Unregistering Your Control
  Understanding CDATA Sections
  Understanding XML Document Transmission Restrictions
  Using a Third-Party Product to Document Your Components
  Creating a Simple Survey Form
  Creating the Database
  Creating the Server-Side Component
  Designing a Survey Form
  Testing the Input Application
  Writing an Analysis Component
  Designing an Output Application
  Survey Application Data Processing Concerns
  Testing the Output
  Creating a Simple Data Entry Form Application
  Creating the Database
  Designing the Server-Side Component
  Creating a Data Entry Client
  Testing the Data Entry Application
  Security, Privacy, Performance, and Reliability Issues
  Security
  Privacy
  Performance
  Reliability
  Handling Data Entry and Survey Form Errors
  Using Templates for Quick Forms
  Using MIME for SOAP Applications
  Special Considerations for MIME Messages
  Where MIME Support Is Going
  Project
8 Providing Remote Database Access
  Remote Database Application Uses and Concerns
  Concerns
  Common Uses Based on SOAP Limitations
  Developer Shortcuts for Remote Database Applications
  Using Complex Data Types
  Understanding the Problem
  Understanding WSDL Generator Differences
  Describing the Interface
  Creating the Server-Side Component
  Creating the Remote Client
  Testing the Complete Data Type Application
  Defining the SQL Server Database
  Creating the Server-Side Component
  Tips for Working with Database Components
  Working with SQLXML
  Using Multiple Server-Side Components
  Generating the Code
  Creating a Middle-Tier Component
  Creating the Client-Side Application
  Testing the Application
  Quick Fixes for Remote Database Applications
  Loss of Connection
  Odd Data Entry Errors
  Performance Issues
  Server Is Busy or Missing Objects
  Addressing Transaction Issues
  Troubleshooting
  How Do I Detect SOAP-Related Database Errors?
  Are the Database Shortcomings of SOAP Permanent?
  What Are the Top Ten Issues for SOAP Database Developers?
9 Moving to Web-Based Applications
  Uses for Web-Based Applications
  Overcoming Problems with Web-Based Applications
  Updating a Thick Client Application for Thin Client Use
  Generating Proxy Classes the Easy Way
  Using Thick and Thin Clients Simultaneously
  Creating the Server-Side Component
  Creating the Processing Component
  Designing the Thick Client Application
  Designing a Thin Client Form View
  Designing the Thin Client Web Page
  Creating a Live Data Application
  Handling Web-Based Application Errors
  Handling Connection Loss
  Scripting Errors
  Component Security Problems
  Service Is in Use
  ActiveX Component Can’t Create Object
  Nothing Happens or Strange Error Message
  Security Issues for Web-Based Applications
  An Overview of the Potential Security Solutions
  Using SSL
  Using IBM Web Services Toolkit
  Quick Fixes for Memory and Other Resource Problems
  Component Interactions
  Benefits of the Script Debugger
  Human Language Support
  ASP and Component Communication
  Component Doesn’t Support Locale Error
  Case Study
  An Overview of the Problem
  The Solution
  The SOAP Connection
  A Negative to Consider
10 Working with PDAs
  Special Needs for PDAs
  The Case for PDAs
  Special Add-ons
  Networking
  Operating System
  Getting SOAP for Your PDA
  pocketSOAP
  IdooXoap
  kSOAP
  Updating the Complex Type Example
  Creating the Client Code
  Differences in Implementation
  Testing the Application
  Updating the Computer Name Example
  Creating the Code
  A Look at the Message Traffic
  Server Differences Revisited
  Testing the Application
  Addressing PDA Display Issues
  Screen Size
  Using Color
  Pointer Pointers
  Beyond PDAs to Telephones
  Understanding PDA Security Issues
  Troubleshooting
  Why Does the Example Seem to Run, and Then Display Nothing Onscreen?
  Why Doesn’t My Code Run Properly on All of My PDAs?
  How Do I Fix Messaging Problems with the Client?
A SOAP Data Types and Data Type Conversions
  Data Types Overview
  Complex Data Types
  Differences in Implementation
  Data Type Conversions
B Microsoft BizTalk and SOAP
  What Is BizTalk?
  How BizTalk and SOAP Work Together
  An Overview of Useful BizTalk Utilities for SOAP
  BizTalk Editor
  BizTalk Mapper
  BizTalk Orchestration Designer
  BizTalk Fixes a Few SOAP Problems
  Is BizTalk the SOAP Add-On for Your Company?
C Third-Party Tool Reference
  Finding the Right Tools
  Tools That Make You Productive
  Tools That Fix Problems
  Masker 2.0
  MZTools Add-In for Visual Basic
  psWSDL Wizard
  tcpTrace
  XML Spy
  Typical File Views
  Viewing Data in More Than One Way
  Special Features
D SOAP for Visual C++ Developers
  Introduction
  An Overview of the 4S4C SOAP Toolkit
  Features
  Installation
  An Overview of the Application
  Creating the Server-Side Component
  Generating a Static WSDL File
  Developing the Server-Side Component Code
  Making the CONFIG.XML Entry
  Creating the Client
  Initial Setup
  Adding the Code
  Handling SOAP Errors
Glossary
Index
About the Author
John Mueller is a freelance author and technical editor. He has writing in his blood, having produced more than 50 books and more than 200 articles to date. The topics range from networking to artificial intelligence and from database management to heads-down programming. Some of his current books include several COM+ developer guides, a small business and home office networking guide, and a Windows 2000 Performance, Tuning, and Optimization book. His technical writing skills have helped more than 25 authors refine the content of their manuscripts. John has provided technical editing services to both Data Based Advisor and Coast Compute magazines. He has also contributed articles to magazines like SQL Server Professional, Visual C++ Developer, and Visual Basic Developer.
When John isn’t working at the computer, you can find him in his workshop. He’s an avid woodworker and candle maker. On any given afternoon, you can find him working at a lathe or putting the finishing touches on a bookcase. One of his newest craft projects is glycerin soap making, which comes in pretty handy for gift baskets.
You can reach John on the Internet at [email protected]. John is also setting up a Web site at http://www.mwt.net/~jmueller/. Feel free to take a look and make suggestions on how he can improve it. One of his current projects is creating book FAQ sheets that should help you find the book information you need much faster.
Dedication
This is my 50th book, so I spent time considering everything that has gone on during the past 13 years. So many people have helped me that listing them here would consume many pages. Authors require help from those around them. They not only require technical help, but also help with other issues such as the daily needs that we all have. I dedicate this book to all of those tireless helpers who have meant so much to me over the years. I wish that I could list all of you on this page.
Acknowledgments
Thanks to my wife, Rebecca, for working with me to get this book completed. I really don't know what I would have done without her help in proofreading my rough draft. She also helped research, compile, and edit some of the information that appears in this book.
Russ Mullen deserves thanks for his technical edit of this book. Russ greatly added to the accuracy and depth of the material you see here. In addition, he spent many hours working with me through e-mail to find solutions to some of the problems that SOAP presented, and he's responsible for providing many of the Web-site addresses sprinkled throughout the book. Russ went even further in this book; he spent hours finding just the right equipment to use when performing his technical edit. I can never sufficiently express my thanks to him.
As you read this book, you'll see mentions of many third-party products and in-depth looks at products from major vendors. The SOAP community provided much of this information through conversation and by checking the technical accuracy of both code and theory. I'd like to thank everyone, but space won't allow me to mention them all. However, I would like to mention that the people on the Developmentor list server and the Microsoft SOAP newsgroups were exceptionally helpful.
Some special people stand above the rest. Simon Fell helped me in many areas, including the complex type and PDA examples. He also helped with the products he develops and supports, including 4S4C and pocketSOAP. Roger Wolter helped in many of the Microsoft SOAP Toolkit examples. He read and commented on many of the theory sections of the book. Paul Kulchenko discussed MIME and SOAP attachments with me, among other topics. Karen Watterson provided several tips for the book and pointed me toward some of the third-party products. Rosimildo DaSilva taught me about SOAP and embedded systems. Yasser Shohoud helped me understand the problems with user-defined type support in SOAP toolkits. Birgit Reidinger helped me with XML Spy and discussed some compatibility issues with me. Ahmad Baitalmal provided me with a glimpse of a completely functional SOAP application that appears as one of the case studies in this book. He reviewed and helped me refine the case study so that you could read about a real-world application in action. Jacek Kopecky provided me with information about IdooXoap and discussed PDA issues with me.
Every book begins with a good outline and proposal. I would like to thank the following reviewers for their help in getting this book off to a good start: Clemens Vasters, Jeremiah Talkar, Ken Rabold, Madhu Govinda, Rob Caron, and Alek Aslom. All of these reviewers provided solid comments that affected the final content of the book and my approach in discussing certain topics.
Matt Wagner, my agent, deserves credit for helping me get the contract in the first place and taking care of all the details that most authors don't really think about. I always appreciate his help. It's good to know that someone wants to help.
Finally, I would like to thank Candace Hall, Michelle Newcomb, Sarah Robbins, Karen Shields, Maureen McDaniel, and other members of the Que staff for their assistance in bringing this book to print. Writing a book about SOAP presented many logistical challenges, and I appreciate their willingness to give me the time required to put a good book together. Que was also kind enough to provide me with a Palm to use in the PDA section of this book.
Tell Us What You Think!
As the reader of this book, you are our most important critic and commentator. We value your opinion and want to know what we're doing right, what we could do better, what areas you'd like to see us publish in, and any other words of wisdom you're willing to pass our way.
As an associate publisher for Que, I welcome your comments. You can fax, email, or write me directly to let me know what you did or didn't like about this book, as well as what we can do to make our books stronger.
Please note that I cannot help you with technical problems related to the topic of this book, and that due to the high volume of mail I receive, I might not be able to reply to every message.
When you write, please be sure to include this book’s title and author as well as your name and phone or fax number. I will carefully review your comments and share them with the author and editors who worked on the book.
Fax: 317-581-4666
Email: [email protected]
Mail: Associate Publisher, Que, 201 West 103rd Street, Indianapolis, IN 46290 USA
INTRODUCTION
In this chapter
What's in This Book
On the Web
Intended Audience
Equipment Used for This Book
Conventions Used in This Book
It is difficult to get some types of applications to work on the Internet because of the lack of technologies that can move data from one point to another safely and with few technical problems. In addition, developers need a single standard for transferring data that will work on all operating system platforms.
The need for a single solution for data transfer is becoming apparent as financial institutions begin moving their operations to the Internet. Recently, credit card companies began actively promoting their Web sites as a better alternative for customer support than calling on the telephone. Many trade journals have also had stories of major financial institutions making the move to the Internet.
DCOM and CORBA, the two most popular technologies currently in use, have a myriad of technical problems, including lack of operating system platform independence. The fact that many Web servers with firewalls won’t work with either technology only complicates matters. In addition, the use of two technologies means that developers end up working twice as hard to get their distributed applications to work.
Clearly, developers require a new data transfer technology, and I believe that technology will be the Simple Object Access Protocol (SOAP). SOAP is based on XML and allows data transfer through firewalls with a minimum of problems. Because SOAP is XML-based, developers find that it suffers few platform-related problems. (Of course, compatibility is always an issue with a new technology.) In short, SOAP is the technology that you need to solve your distributed application programming problems. It will allow you to transfer data between disparate machines more efficiently and with fewer problems.
SOAP is such a new technology that most developers have to rely on the white papers produced by developers of the technology for reference purposes.
Special Edition Using SOAP provides the kind of information that you’ll need to get your applications working quickly after you understand the basic theory behind SOAP. This book will help you gain proficiency in SOAP in the following ways:
■ A technology overview that cuts out superfluous information and provides just the information needed to write applications
■ Detailed examples that show how to create new applications as well as convert existing ones
■ Tips that show how to become proficient with SOAP quickly
■ Solutions for problems with using SOAP in certain environments
■ Performance and other tips that will make applications perform faster and more reliably
■ A data type and data conversion reference
■ An overview of Microsoft’s BizTalk technology
■ A third-party tool reference that will allow you to start working with SOAP immediately, rather than build tools first
As you can see, this book provides complete coverage for the developer who needs to complete projects quickly. Special Edition Using SOAP focuses on getting work done, rather than on the theoretical aspects of the technology. Of course, the book covers all of the details, such as ensuring that security is in place and that the application will perform reliably. Although the focus is on speed, the book will also provide tips that ensure development speed gains aren’t overshadowed by hours of debugging later. Special Edition Using SOAP is the book you need on your shelf because it provides the real-world tools that you require to make Web-based applications a reality.
What’s in This Book
I’ve designed this book to get you up and running with SOAP as quickly as possible. In fact, this book contains only three theory chapters; the remaining seven chapters focus on example code and how to accomplish work quickly. You’ll find a wealth of productivity aids and tips on how to avoid pitfalls. The following sections provide an overview of the book.
Chapter 1: An Overview of SOAP
This chapter tells you what SOAP is and how it differs from older technologies, such as DCOM and CORBA. It also talks about how SOAP uses HTTP and XML with some additional elements to create a message. Finally, we’ll talk about the advantages and disadvantages of using SOAP to develop applications. This includes problems you might experience, such as getting the same performance from your application as before you converted it to use SOAP.
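As a rough illustration of the idea that a SOAP message is an XML document carried by HTTP, the following Python sketch builds a SOAP 1.1 envelope and wraps it in the headers of an HTTP POST. It is only a sketch under stated assumptions, not code from the book: the method name GetServerStatus, the namespace urn:example:status, the host example.com, and the path /soap/listener are all hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(method, method_ns, params):
    """Return a SOAP 1.1 envelope (as bytes) that calls `method` with `params`."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The method element lives in the service's own namespace.
    call = ET.SubElement(body, f"{{{method_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def build_http_request(host, path, soap_action, payload):
    """Wrap the XML payload in the HTTP POST headers that carry it."""
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {len(payload)}",
        f'SOAPAction: "{soap_action}"',  # the "additional element" SOAP 1.1 adds to HTTP
    ]
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + payload


if __name__ == "__main__":
    # Hypothetical service and method names, for illustration only.
    xml_payload = build_envelope(
        "GetServerStatus", "urn:example:status", {"serverName": "MainServer"}
    )
    request = build_http_request(
        "example.com", "/soap/listener",
        "urn:example:status#GetServerStatus", xml_payload,
    )
    print(request.decode("utf-8"))
```

Because the request is an ordinary HTTP POST with an XML body, it travels over the same port as normal Web traffic, which is the reason firewalls that block DCOM or CORBA traffic usually let it through.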
Chapter 2: SOAP in Theory
SOAP is a protocol (set of rules), rather than an application programming interface (API) or programming methodology. This chapter will help you understand how SOAP works and why it can avoid certain problems that DCOM and CORBA can’t in a distributed environment. The coverage in this section will be complete, but extremely focused. The chapter provides many references to Microsoft and other vendor material on the Internet. Even if you need a detailed view of the inner workings of SOAP, the combination of the theory overview in this chapter and the references to white papers on the Internet should provide everything you need.
Chapter 3: An Overview of Security Issues for SOAP
Security issues are a concern for any application. Protecting data is a problem that has plagued computers since the very beginning. The distributed application environment provided by the Internet only makes matters worse.
This chapter discusses the major security concerns for any SOAP application. Later chapters will discuss specific concerns for particular application types. You’ll also find a discussion of privacy issues in this chapter. Securing the user’s identity is becoming more important as countries pass laws requiring secure and hassle-free communication. Finally, this chapter discusses some alternative technologies, such as biometrics, for securing your system. It’s important to look at these alternatives as a way to reduce coding requirements and application usage complexity.
Chapter 4: Using SOAP to Create a Simple Application
This chapter contains the first complete SOAP example in the book. The example isn’t that complicated, but we do examine it thoroughly. The chapter will discuss issues such as error handling and testing techniques. We’ll also talk about methods for creating WSDL files and discuss how those WSDL files reflect the underlying component technology. This chapter concentrates on the Microsoft SOAP Toolkit.
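To give a feel for the kind of error handling mentioned above, here is a minimal Python sketch that checks a SOAP 1.1 response for a Fault element and turns it into an exception. It does not use or imitate the Microsoft SOAP Toolkit; the SoapFault class, the check_for_fault function, and the sample fault text are assumptions made up for this illustration. Only the Fault, faultcode, and faultstring element names come from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


class SoapFault(Exception):
    """Hypothetical exception type that carries the fault code and message."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def check_for_fault(response_xml):
    """Raise SoapFault if the response carries a Fault; otherwise return the Body."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        raise ValueError("response has no SOAP Body")
    fault = body.find(f"{{{SOAP_ENV}}}Fault")
    if fault is not None:
        # faultcode and faultstring are unqualified children of Fault in SOAP 1.1.
        raise SoapFault(
            fault.findtext("faultcode", default=""),
            fault.findtext("faultstring", default=""),
        )
    return body


# A made-up server response, used only to exercise the function.
SAMPLE_FAULT = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>Component could not be created</faultstring>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

if __name__ == "__main__":
    try:
        check_for_fault(SAMPLE_FAULT)
    except SoapFault as err:
        print("Server reported a fault:", err)
```

Checking the Body before reading any return value keeps the client from mistaking an error report for real data, whichever toolkit produced the response.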
Chapter 5: Migrating an Application from DCOM to SOAP
Many developers will spend more time converting existing applications to SOAP than creating new ones. This chapter shows how to move an application from the DCOM environment to the SOAP environment. It also shows how to maximize your current coding investment by continuing to use as many of the DCOM modules as possible. We’ll also discuss some practical issues. For example, SOAP isn’t always the correct solution to your application problem; you might find that DCOM is still the right tool for the job.
Chapter 6: Creating Remote Access Utilities
More employees are on the road than ever before. They want access to their applications back at the office, and it’s your responsibility to provide that access. In addition, a remote user might have new application needs, such as a utility for quickly checking in at the home office without calling. This chapter looks at these issues and more. You’ll find two programming examples that show how to use SOAP for the remote user’s needs.
Chapter 7: Creating Data Entry Forms and Surveys
Web-based applications are useful because you can share information with others outside your company. You can also gather information from others and use it for analysis. For example, companies are always looking for new ways to request information from customers and use it to improve sales or service. SOAP is a useful technology because it extends the data entry form and survey application to perform tasks that you can’t generally perform now. For example, such an application can completely automate the process of entering new data into the database. For that matter, you can automate the analysis as well and send the results to concerned parties.
This chapter also discusses some practical issues you need to consider, such as using templates to reduce the amount of coding required to create an application. Templates are especially important for survey applications where the presentation changes regularly. We’ll also discuss some special issues, such as the enhanced error-handling requirements for a survey application.
Chapter 8: Providing Remote Database Access
The database is the center of every business. In fact, the organized storage of data is the center of every business, even those that don’t have a computer (admittedly rare today). SOAP provides new opportunities to handle data in the distributed application environment. Not only can you create new data connections internally, but partners can also access your data directly as needed.
This chapter addresses some special database application needs. For one thing, most developers want to use transactions to ensure reliable data transfer from one point to another. We’ll discuss this issue along with methods for fixing database application problems. Although this chapter won’t provide you with a complete tutorial on database design and implementation, it does provide you with the SOAP piece of the puzzle.
Chapter 9: Moving to Web-Based Applications
Several of the previous chapters have led up to this one. The application of the future will always provide a distributed connection. The user won’t care where he uses the application because it won’t matter. Web-based applications represent the future of programming technology.
This chapter shows you how to create some simple Web-based applications. Of course, the important issue is preserving all of that code you have right now. This chapter also tells you how to preserve your investment by leveraging your current code base. You’ll avoid creating applications from scratch by carefully designing your Web-based application to hook into the existing application infrastructure.
Chapter 10: Working with PDAs
Many users are completely hooked on the personal digital assistant (PDA). This little device holds a significant piece of their lives. It stores all of their contact information, tasks they need to perform, and schedule for upcoming weeks. Unlike a desktop computer, a PDA is small and you can take it with you anywhere to record new information. In short, PDAs are becoming an essential asset for most employees.
It won’t be long before you’ll find yourself moving existing applications to the PDA, at least the applications that are small enough to move. This chapter examines methods for moving common applications to the PDA. We’ll look at two examples so that you can see some of the pitfalls in developing a PDA solution.
This chapter also discusses some of the issues surrounding PDA development. We’ll cover SOAP-specific issues, such as the selection of a SOAP toolkit for PDA development. This chapter also discusses a few general PDA issues. For example, the small screen that a PDA provides will affect the way that you develop the application interface.
Appendix A: SOAP Data Types and Data Type Conversions
This short appendix tells what data types SOAP supports and how to convert data from existing languages to a format that SOAP will understand. It’s a quick reference with code snippets that show how to perform the required work. There aren’t any complete programming examples.
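The kind of conversion the appendix describes can be sketched in a few lines. The Python function below maps native values to XML Schema type names and lexical text, which is the form SOAP encoding expects. This is a hedged illustration rather than the appendix's own table: the to_xsd function is hypothetical, the xsd: prefix is assumed to be bound to the XML Schema namespace by the caller, and only a handful of simple types are covered.

```python
import math
from datetime import datetime


def to_xsd(value):
    """Return (XML Schema type name, lexical text) for a simple Python value."""
    if isinstance(value, bool):
        # Test bool before int: in Python, bool is a subclass of int.
        return "xsd:boolean", "true" if value else "false"
    if isinstance(value, int):
        # xsd:int is a 32-bit type; fall back to xsd:integer for larger values.
        if -2**31 <= value < 2**31:
            return "xsd:int", str(value)
        return "xsd:integer", str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("this sketch does not handle INF or NaN")
        return "xsd:double", repr(value)
    if isinstance(value, datetime):
        return "xsd:dateTime", value.isoformat()
    if isinstance(value, str):
        return "xsd:string", value
    raise TypeError(f"no mapping for {type(value).__name__}")


if __name__ == "__main__":
    for sample in (True, 42, 2**40, 1.5, "MainServer", datetime(2001, 9, 1, 12, 0, 0)):
        print(sample, "->", to_xsd(sample))
```

The ordering of the checks matters more than it looks: a language's native type hierarchy rarely lines up with the schema types, so each toolkit has to make choices like these, and different toolkits do not always make the same ones.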
Appendix B: Microsoft BizTalk and SOAP
Although Microsoft isn’t the only company working with SOAP, for readers of this book, Microsoft will probably be a major supplier of SOAP technology. BizTalk provides access to Microsoft’s XML and SOAP technology implementations, so a discussion of this important server product is essential. This appendix provides an overview of BizTalk as a whole, plus specifics on how this server will provide XML and SOAP services.
Appendix C: Third-Party Tool Reference
Because SOAP is so new, you’ll want to know what types of tools are available from third parties. This appendix provides a list of the tools that are available at the time of writing. This list isn’t complete, but it does represent some of the better tools available for SOAP development today.
Appendix D: SOAP for Visual C++ Developers
Most of the examples in this book concentrate on Visual Basic or some form of scripting. This appendix discusses SOAP for the Visual C++ developer. You’ll find example code and a complete discussion of Visual C++ development issues. Like the example in Chapter 4, “Using SOAP to Create a Simple Application,” the appendix example is somewhat simple, but the discussion is in depth. We’ll talk about issues such as testing and SOAP error handling, as well as the creation of client-side code.
On the Web We’ve provided all the source code for the examples in the book at an easy to find website. Just go to www.quepublishing.com and check out the site. There you will find easy to download executables that contain all the source code separated by chapter. Everything you need to complete the exercises is right there.
Intended Audience I wrote this book for the developer who has a good understanding of programming principles and has worked with distributed applications sometime in the past. In other words, a complete novice will have a difficult time understanding the material, but someone with a little experience will get some information from it almost immediately. All of the examples assume that you know something about Visual Basic or Visual C++ programming and have worked with DCOM or CORBA in the past. You need to understand terms like interface because the theory sections in this book are short and concentrate only on SOAP principles. As the book progresses, the topics become more difficult and the anticipated understanding level of the reader increases. By the time you reach the remote database programming example (Chapter 8, “Providing Remote Database Access”), I’m assuming that you’re at least an intermediate to advanced reader who has spent some time developing remote database applications. You must understand how to work with databases. Although the chapters do provide details on constructing the applications, you won’t find any information about working with the database managers or procedural steps for constructing the tables. All that I provide is the schema you need to build the application. This doesn’t mean that I’m going to bury you with arcane information; this book covers the various topics in simple terms that you’ll find easy to understand. The purpose of this book is to discuss as many SOAP issues as possible in terms that everyone will understand. However, a beginner will still very likely get lost as the book progresses because there’s little in the way of introductory information.
Equipment Used for This Book I made some assumptions while writing the application programming examples in this book. First, you need at least two machines: a workstation and a server. This two-machine setup is the only way that you’ll see SOAP in action and truly know it works as anticipated. During the writing of this book, I used a Windows 2000 and Windows 98 workstation. The servers included Windows 2000 Server and Red Hat Linux. I also experimented with two Web servers running on a NetWare file server, and two PDAs (Windows CE-based Pocket PC and a Palm). I tested all of the examples in this book with Visual Studio 6.0 Enterprise Edition. Most of the examples use Visual Basic, but you’ll also find examples written in Visual C++ and in various scripting languages. None of these examples is guaranteed to work with any other programming language products and none of them will work with the educational versions of Visual Studio.
You must install the latest service packs for all products before the examples will work properly. SOAP is a new technology that relies on the latest versions of many DLLs, especially the XML parsers used in the examples. This book doesn’t support the beta versions of the SOAP toolkits unless I specifically note that I’ve used a beta version. Make sure you use release versions of the SOAP toolkits, especially the Microsoft SOAP Toolkit Version 2.0. Some of the example programs rely on a database manager. I used SQL Server 7.0 for all of the examples in this book. The source code provides scripts that I tested on SQL Server, but that may work with other database managers as well. The sample data is in delimited text format and you should be able to import it into any database manager. In short, you can use any database manager you want, but the examples might require modification to do so.
Conventions Used in This Book

Several conventions used within this book will help you get more out of it. Look for special fonts or text styles and icons that emphasize special information.

■ Sometimes I’ll ask you to type something. This information always appears in bold type like this: Type Hello World.
■ Code normally appears on separate lines from the rest of the text. However, there are special situations where small amounts of code appear directly in the paragraph for explanation purposes. This code will appear in a special font like this: Some Special Code.
■ Definitions are always handy to have. I’ll use italic text to differentiate definitions from the rest of the text, like this: A CPU is a required part of your machine.
■ In some cases, I won’t have an exact value to provide, so I’ll give you an idea of what you should type by enclosing it in angle brackets like this: Provide a value for the Name field.
■ You’ll always be able to recognize menu selections and command sequences because they’re implemented like this: Use the File | Open command.
■ URLs for Web sites are presented like this: http://www.microsoft.com.
Notes help you understand a principle or provide amplifying information. In many cases, a note is used to emphasize some piece of critical information that you need.
All of us like to know special bits of information that will make our job easier, more fun, or faster to perform. Tips help you get the job done faster and more safely. In many cases, the information found in a Tip is drawn from experience, rather than through experimentation or the documentation.
Any time you see a caution, make sure that you take special care to read it. This information is vital. I’ll always use the Caution to designate information that will help you avoid damage to your application, data, machine, or self. Never skip the Cautions in a chapter, and always follow their advice.
Finding what you need quickly is more important than ever before. Many people work at a pace that they call Internet time. They no longer have the luxury of performing hours of research to find that magic piece of information needed to complete an application. With this in mind, I’ve spent the time you don’t have to find unique information sources on the Internet. This icon will always provide that information so that you can get what you need quickly.
CHAPTER 1

An Overview of SOAP

In this chapter
What Is SOAP?
How SOAP Differs from DCOM and CORBA
SOAP, HTTP, and XML
Problems Solved by Using SOAP
Performance Issues
SOAP and the Web Server
Why Make the Move to SOAP?
Case Study
At one time in the history of the PC, computing consisted of a single computer using simple applications that relied only on local resources. Eventually, networking changed the way people shared expensive peripherals and data, but the data was still, to an extent, stored locally. As companies grew and became more dependent on the PC, the concept of the Local Area Network (LAN) gave way to the Metropolitan Area Network (MAN) and Wide Area Network (WAN). The client/server architecture of days gone by allowed a limited form of distributed computing. However, there was still a direct connection and all of the data appeared within the confines of one company, so sharing data was still relatively easy.

Today, computing is all about distributed applications running on machines that may not know anything about each other. Data is no longer restricted to one company; business-to-business communication is now the norm. Consequently, these machines may not even have a direct connection or access the network at the same time. Data sharing occurs between individuals who may not ever meet. In short, there isn’t a direct connection between the provider and the user of data anymore, which means the rules used in the past don’t work well for modern communication needs.

Early computers relied on protocols (predefined rules) that ensured safe data transfer between machines that knew what to expect from each other. These protocols work fine on a LAN, MAN, or WAN because there’s a direct connection between machines. A protocol designed for the LAN environment, however, may run into problems when dealing with something like the Internet. For example, the protocol may expect to find another Windows machine when the client really needs to communicate with a Unix server. That’s precisely what’s happened and why we need a new protocol named Simple Object Access Protocol (SOAP).
The fact that you’re reading this book means that you already have some idea of why you need SOAP and may even know something about it from a technical perspective. The first section of this chapter is going to tell you more about SOAP—what it does and how it works. This section isn’t going to provide many details, but it will prepare you for the detailed discussion in Chapter 2. I want to present a basic overview of the technology before jumping into the details. SOAP is different from older protocols. It provides some features these older protocols don’t have. For example, unlike Distributed Component Object Model (DCOM) and other binary protocols, SOAP won’t interfere with the operation of the firewall on your Web server. It uses a plain text transfer method that firewalls readily accept. This means that you can maintain the integrity of your company’s Web site, and still get the data you need. Some of these neat new features come at a cost. You lose some of the features provided by the older protocols. For example, security is an issue that many SOAP developers are trying to address as of this writing. The second section of the chapter provides a brief comparison of SOAP to older protocols, such as DCOM and Common Object Request Broker Architecture (CORBA). You’ll learn how SOAP excels and where it provides less than stellar results. SOAP actually interacts with other Web technologies that you may have used in the past. Although SOAP can theoretically rely on any transport protocol, the current toolkit from Microsoft uses the same Hypertext Transfer Protocol (HTTP) used to move Web pages across the Internet. (Note that you can use any reliable protocol to transfer SOAP
messages—HTTP is simply the most straightforward and easily accessible method right now, so developers are using it.) In addition, SOAP relies on the eXtensible Markup Language (XML) to format data before moving it from one point to another. The third section of the chapter discusses the interaction between these three technologies. As previously mentioned, Microsoft designed SOAP to address specific problems. Technologies such as DCOM just can’t make the grade in the distributed computing environment of the Internet. The fourth section of the chapter discusses these problems in detail and explains how SOAP addresses them. SOAP changes the performance picture—it uses a new technique to transfer data, so differences in performance are expected. In many cases, SOAP is slower than older technologies like DCOM. You can’t beat a direct binary connection between client and server that relies on optimized data transfers. However, given the distributed nature of applications today, there are also situations when SOAP is faster than binary technologies. In the fifth section of the chapter, we’ll talk about SOAP performance issues. This section also talks about ways of mitigating some of those performance losses by using better coding techniques. Microsoft designed SOAP to work with the Internet—which means you’ll eventually run into a Web server. The sixth section of the chapter discusses how SOAP works with a Web server to handle client requests. We’ll talk about some of the mechanics you’ll need to know later, like common storage locations for files and some of the implementation details. The final section of the chapter discusses a topic that many of you will ask about once you see everything required to move to SOAP. It’s important to know why this move is so important and how you’ll benefit from it. SOAP is a great technology with a very important purpose for your company. This section discusses why you need to include SOAP in your programming toolbox.
What Is SOAP?

SOAP is a lightweight communication protocol based on the eXtensible Markup Language (XML). It allows applications and components to exchange data. As mentioned in the introduction, SOAP currently relies on HTTP as a transport protocol, but could use any reliable transport protocol to transfer data. In addition, SOAP is useful for many data transfer needs, including LANs, WANs, and MANs; it’s not just for the Internet.
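To make the idea of "XML carried over HTTP" concrete, here is a minimal sketch in Python of what a SOAP 1.1 request might look like on the wire. It is an illustration under stated assumptions, not the output of any toolkit discussed in this book: the `GetQuote` method, the `urn:example:quotes` namespace, the host, and the path are all hypothetical names invented for the example. Only the envelope namespace and the `SOAPAction` header come from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_envelope(method, params):
    """Return a SOAP 1.1 envelope (as bytes) that calls `method` with `params`."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        # Every argument travels as text inside its own element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def build_http_request(host, path, soap_action, payload):
    """Wrap the envelope in a plain HTTP POST, the transport described above."""
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {len(payload)}",
        f'SOAPAction: "{soap_action}"',
        "",  # blank line separates headers from the body
        "",
    ]
    return "\r\n".join(lines).encode("ascii") + payload


payload = build_envelope("GetQuote", {"symbol": "QUE"})  # hypothetical call
request = build_http_request(
    "example.com", "/soap", "urn:example:quotes#GetQuote", payload
)
print(request.decode("utf-8"))
```

Nothing in the request is binary: the headers are ordinary HTTP text and the body is ordinary XML, which is why the same message could just as easily travel over another transport.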
Learning XML is an essential part of learning SOAP. You don’t have to become an XML guru, but knowing the basics is a requirement. Many XML Web sites offer tutorials and other information about this technology. For example, DevelopMentor (http://www.develop.com/dm/dev_resources.asp) provides good tutorials on this and other distributed application topics from a Microsoft perspective. W3Schools (http://www.w3schools.com/xml/) provides detailed tutorials in small segments. Another good place to look for XML training is Courses in XML by QTrain (http://www.qtrain.net/). Once you finish the basic XML tutorials, you’ll want to visit XSLT.com (http://xslt.com/resources_tutorials.htm) and learn more about data transformation
techniques. ZVON.org (http://www.zvon.org/index.php?nav_id=2) includes tutorials on both XML namespaces and the eXtensible Stylesheet Language Transformations (XSLT).
Despite what you may have heard, SOAP isn’t another conspiracy by Microsoft to take over the world; it’s a protocol supported by many vendors. Some of the most notable contributors to SOAP are Ariba, Commerce One, Compaq, DevelopMentor, HP, IBM, IONA, Lotus, SAP, and UserLand. Vendors who support SOAP hope it will eventually gain standards status. The World Wide Web Consortium (W3C) is currently discussing SOAP. You can read the W3C comments at http://www.w3.org/Submission/2000/05/Comment and see the initial specification at http://www.w3.org/TR/SOAP/. Since SOAP is a new protocol that isn’t tied to a particular operating system, the vendors working on it are free to add features that make SOAP especially suited to distributed application use. Three of the features that make SOAP attractive are:

■ No ties to existing component technologies; SOAP will theoretically work with any platform.
■ No ties to a particular programming language; you can use SOAP with any language capable of outputting text. (You can also use a special toolkit to output the formatted text, as shown in Chapter 4.)
■ Easy to learn and simple to extend.
You’ll see as the book progresses that these three points are important because they’ll affect your perception of SOAP as an application solution. For example, the fact that vendors haven’t tied SOAP to a particular programming language means that you’ll very likely see a wealth of toolkits on the market. These toolkits may not all produce compatible code. In addition, they may rely on server or client-side components that aren’t compatible with other SOAP implementations. Problems like these will become less noticeable as SOAP nears standardization. The last point is important as well. SOAP really is easy to learn. It’s verbose, which means that some of the code listings you see become quite long and look complicated, but the underlying technology is simple. The complexity that you’ll see as we work with SOAP throughout the book comes from the various extensions that Microsoft and other vendors add. These extensions provide additional flexibility and allow you to tailor a SOAP implementation to specific needs.
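The claim that any language capable of outputting text can produce SOAP is easy to check for yourself. The sketch below, written in Python, fills in a text template by hand and only uses an XML parser afterward to confirm that the result is well formed. The method name, namespace, and parameter are hypothetical, and the template assumes the namespace is a plain URN that needs no escaping. A real toolkit would do considerably more work than this.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# A SOAP 1.1 envelope written out as ordinary text.
TEMPLATE = (
    '<SOAP-ENV:Envelope '
    'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body>"
    '<m:{method} xmlns:m="{namespace}">{params}</m:{method}>'
    "</SOAP-ENV:Body>"
    "</SOAP-ENV:Envelope>"
)


def render_call(method, namespace, params):
    """Fill in the template; escape() protects characters such as & and <."""
    rendered = "".join(
        f"<{name}>{escape(str(value))}</{name}>" for name, value in params.items()
    )
    return TEMPLATE.format(method=method, namespace=namespace, params=rendered)


# Hypothetical call; the ampersand shows why escaping matters.
message = render_call("GetQuote", "urn:example:quotes", {"symbol": "AT&T"})
print(message)

# Parsing is only a sanity check that the text really is valid XML.
call = ET.fromstring(message)[0][0]
print(call[0].text)
```

The verbosity mentioned above is visible even here: a one-argument call needs several lines of markup, yet nothing about the markup is hard to follow.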
SOAP is a very hot topic right now because it offers so much. However, getting the latest news can prove difficult because the specifications change so quickly. You can usually rely on news Web sites and newsgroups to provide updated information on a regular basis. For example, the SOAP News Web site at http://soap.weblogs.com/ provides up to the minute information about this new standard. DevelopMentor and other organizations provide list servers that discuss SOAP. The main public SOAP newsgroups appear on the
Microsoft server (news.microsoft.com). You’ll find a wide range of topics in newsgroups like microsoft.public.msdn.soaptoolbox, microsoft.public.xml.soap, and microsoft.public.xml.soapsdk.
SOAP is intertwined with other technologies as well. We’ve already discussed its connection with XML (and by extension, XSLT). Creating a SOAP implementation is only the first step. If you want others to use your implementation, you need to advertise it in some way. As the book progresses, we’ll talk about the significance of Universal Description, Discovery, and Integration (UDDI). You can find out more about this technology at http://www.uddi.org/. The main purpose of UDDI is to allow vendors to advertise services publicly, including those that rely on SOAP. Another technology that figures prominently in the SOAP picture is Web Services Description Language (WSDL). We’ll discuss this technology in detail in Chapter 2. WSDL provides a description of objects and documents on a Web server in the form of a schema. It also allows advanced features of some programming IDEs to display a list of functions applicable to a particular object. Look at http://www-106.ibm.com/developerworks/library/w-wsdl.html to find out more about WSDL.
How SOAP Differs from DCOM and CORBA

SOAP, like DCOM and CORBA, is a wire protocol. The term wire protocol indicates that applications use SOAP to move data from one place to another. Many people assume that because COM and DCOM have similar names, they’re the same kind of protocol. COM is actually a specification that tells how to create components; DCOM enables those components to interact across a network. It’s an important difference to understand because Microsoft designed SOAP to overcome some of the limitations of DCOM, not to replace it or replace COM itself.

The first two sections below provide a comparison of SOAP with DCOM and CORBA. We’ll look at how the protocols compare from a result and method perspective. The essential difference between SOAP and the other two protocols is that SOAP uses plain text, while DCOM and CORBA use a binary format. A third section discusses the Java Remote Method Invocation (RMI). This is a relatively simple wire protocol that’s designed to support Java. It differs from DCOM and CORBA because Java RMI is only designed to support Java; these other protocols support any language. The Java RMI is still a binary protocol, so it differs from SOAP in that regard.
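The difference between a plain text format and a binary format is easier to appreciate with a small experiment. The following Python sketch encodes the same five integers twice: once as packed 32-bit values, which stands in for the general idea of a binary wire format, and once as XML text. Neither encoding is the actual DCOM, IIOP, or SOAP representation; the element names and the choice of 32-bit little-endian integers are assumptions made for illustration only.

```python
import struct
import xml.etree.ElementTree as ET


def encode_binary(values):
    """Pack integers as little-endian 32-bit values, 4 bytes each."""
    return struct.pack(f"<{len(values)}i", *values)


def decode_binary(data):
    """Unpack the 32-bit integers written by encode_binary."""
    return list(struct.unpack(f"<{len(data) // 4}i", data))


def encode_text(values):
    """Write the same integers as XML text, one element per value."""
    items = "".join(f"<item>{value}</item>" for value in values)
    return f"<values>{items}</values>".encode("ascii")


def decode_text(data):
    """Parse the XML text back into integers."""
    return [int(item.text) for item in ET.fromstring(data)]


values = [1, 2, 3, 1000, 65536]
binary = encode_binary(values)
text = encode_text(values)

print(len(binary), "bytes as packed binary")
print(len(text), "bytes as XML text")
print(text.decode("ascii"))  # readable without any special tool
```

The text form is several times larger, but anyone (or any firewall) can read it without knowing the sender’s data layout. The binary form is compact, but only a receiver that agrees on the exact layout can make sense of it.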
DCOM Wire Protocol

The DCOM Wire Protocol, also known as the DCOM Network Protocol or DCOM for short, is a high-level protocol based on the Distributed Computing Environment (DCE) Remote Procedure Call (RPC) Network Protocol and implemented by Microsoft. DCOM relies on several lower-level protocols to accomplish its work. Like SOAP, DCOM oversees the data transfer; it doesn’t perform the mechanics of moving the data. As an enabling
protocol, DCOM ensures that the client and server can talk to each other, but doesn’t participate in the conversation or manipulate the data.
Microsoft didn’t create the DCE RPC Network Protocol specification. The Open Software Foundation (OSF), which is now part of the Open Group, created it. You can find out more about the DCE RPC Network Protocol specification at http://www.osf.org/. Find a good RPC overview at http://www.ja.net/documents/NetworkNews/Issue44/RPC.html. There’s an article about the inner workings of DCOM at http://www.microsoft.com/msj/0398/dcom.htm. The main Microsoft DCOM Web site is at http://www.microsoft.com/com/tech/DCOM.asp.
Figure 1.1 shows a generic DCOM component setup. We aren’t doing anything fancy here since the idea is to learn how the connection between the client and server works. Table 1.1 tells you about the various components shown in Figure 1.1. Notice that the message goes through quite a few transition layers. These layers figure prominently in the performance discussion later in the chapter.

Figure 1.1 Internet Explorer versions 5.5 and above can display SOAP messages in a simple format.

[Figure 1.1 block labels. Client side: Client, OLE32, Service Control Manager (SCM), Proxy, COM Runtime, Security Provider, DCOM Network Protocol (DCE RPC Network Protocol), Protocol Stack (Winsock Driver, User Datagram Protocol (UDP), Internet Protocol (IP), Ethernet Driver). Server side: Server-Side Component, Service Control Manager (SCM), Stub, COM Runtime, Security Provider, DCOM Network Protocol (DCE RPC Network Protocol), Protocol Stack (Winsock Driver, User Datagram Protocol (UDP), Internet Protocol (IP), Ethernet Driver).]
Table 1.1 Components of a Typical DCOM Data Transfer

Component | Description
Client | Originates requests to the server for resources and support. DCOM assumes this is a standard desktop application operating on a LAN.
OLE32 | DLL containing the methods used to create an instance of an object (along with a wealth of other functionality).
Service Control Manager (SCM) | Creates the initial connection between the client and server. DCOM only uses the SCM during object creation. After the client and server establish a connection, the SCM steps out of the picture. The SCM represents one of the handshaking elements that we’ll talk about later in the performance section of the chapter.
Proxy | The server’s presence within the client’s address space. The proxy is actually a table of interfaces. Windows creates and manages it at the request of the COM runtime. The proxy allows the client to think that the server is local, even though the server is located on another machine.
COM Runtime | Operating system elements that host objects and provide client/server communication. The COM runtime is part of any COM-related scenario: both in-process and out-of-process, local and remote.
Security Provider | The security provider logs the client machine into the server machine. The operating system determines the type of security providers available. Some security providers will also protect all data transferred between the client and server in some way, usually using encryption.
DCOM Network Protocol (DCE RPC Network Protocol) | Defines a protocol for creating a connection with a remote server. In addition to implementing a component protocol, this block contains all the elements to implement the Object Remote Procedure Call (ORPC) specification at an application level. This is the wire protocol that’s the target of this section of the chapter.
Protocol Stack | Actual network communication requires more than just one protocol; there are network-related protocols to consider as well. The protocol stack consists of all the protocols required to create a connection between the client and server, including network-specific protocols like transmission control protocol/Internet protocol (TCP/IP). Figure 1.1 shows a typical protocol stack consisting of a Winsock (Windows sockets) driver, user datagram protocol (UDP), IP, and an Ethernet driver. Not shown is the Ethernet network interface card (NIC) actually used to create the physical connection between the client and server.
Stub | The client’s presence within the server’s address space. Windows creates and manages the stub at the request of the COM runtime. As far as the server is concerned, it’s working with a local client.
Server | The COM object that the client has requested services and resources from.
Figure 1.1 is typical of a binary wire protocol like DCOM and even CORBA. As you can see, the design is somewhat complex, but works well in a LAN, WAN, or MAN environment. The point is that DCOM is an enabling technology, not a data manipulation technology. You can view it as a sort of traffic cop and trail guide rolled into one. DCOM uses a specific procedure to ensure reliable communications between machines. This procedure relies on a lot of handshaking to ensure that each phase of the data movement process occurs as anticipated. Contrast this with SOAP, which sends the data and simply assumes that it will arrive at the remote location. Here are the steps that DCOM follows to ensure a safe data transfer between machines:

1. Client issues an object creation call. The call must include both a class ID (CLSID) and a server name (along with any information required to log onto the server). As an alternative, the client can issue a standard call that OLE32.DLL will resolve to a remote location based on a registry entry, or the client can use monikers.
2. OLE32.DLL calls upon the client-side SCM to create a connection to the server machine because it can’t service the call locally.
3. DCOM creates the required packets to send information from the client to the server.
4. The server-side SCM creates an instance of the desired server-side component and returns a pointer to the object instance to the client.
5. The server-side SCM calls upon the COM runtime to create a stub for the component to interact with.
6. The client-side SCM calls upon the COM runtime to create a proxy for the client to interact with.
7. The SCM returns a pointer to the proxy to the client.
8. Normal client and server-side component communications begin.

As you can see, DCOM provides a robust data communication environment that relies on a client and server that exist at the same time and have a direct connection.
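The proxy and stub that appear in the steps above (and in Table 1.1) are the pieces that let the client believe the server is local. The Python sketch below shows only that general pattern and makes no attempt to reproduce how DCOM builds it: the `Calculator` component is hypothetical, JSON stands in for DCOM’s binary marshaling, and an ordinary function call stands in for the network and the SCM handshaking.

```python
import json


class Calculator:
    """Hypothetical server-side component."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server side: unpacks a request and calls the real component."""

    def __init__(self, target):
        self._target = target

    def dispatch(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: looks like the component, but only forwards calls."""

    def __init__(self, send):
        self._send = send  # carries request bytes to the server, returns reply bytes

    def __getattr__(self, name):
        # Any method the client calls is packed up and sent to the stub.
        def remote_call(*args):
            request = json.dumps({"method": name, "args": list(args)})
            reply = self._send(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]

        return remote_call


stub = Stub(Calculator())
proxy = Proxy(stub.dispatch)  # a direct call replaces the network in this sketch
print(proxy.add(2, 3))        # the client code reads like a local call
```

Notice that the client never touches `Calculator` directly. Every call becomes a message, which is the part of the design that both DCOM and SOAP share; they differ in how the message is formatted and in how much handshaking surrounds it.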
It’s important to understand that DCOM is still an optimal technology when reliability is more important than flexibility. DCOM provides a secure and reliable environment that SOAP can’t provide. On the other hand, the complexities of DCOM make it difficult to use over the Internet. There are simply too many requirements that DCOM has to satisfy before communication can occur.
CORBA IIOP

CORBA and DCOM have existed side by side for an eternity in computer time. Each technology has proponents and detractors who protect their point of view with the vigor of religious zealots. However, many developers view CORBA as simply an alternative to DCOM and therefore feel that CORBA has the same limitations. The truth is slightly different. CORBA is more like SOAP than DCOM when it comes to design goals, even though CORBA is a binary protocol.

There are two main pieces to this architecture, just like COM and DCOM in the world of Windows. CORBA, like COM, provides a specification for component services. There are several CORBA implementations on the market. IBM provides the System Object Model (SOM) and Distributed SOM (DSOM) architectures. Likewise, Netscape offers the Open Network Environment (ONE) platform. Each CORBA implementation differs a little in low-level details, which we won’t discuss in this chapter since they aren’t important in the overall comparison of CORBA to SOAP.

Unlike COM, CORBA was designed by the Object Management Group (OMG) to run on more than one operating system. In addition, OMG looked at Internet communication as an important CORBA feature from the outset. CORBA tends to perform better than DCOM on the Internet, but still provides less than acceptable performance in most cases. These design differences are what bring CORBA closer to SOAP than to DCOM.

You can find out more about OMG at http://www.omg.org/. One of the best Web sites to visit for CORBA details is Distributed Object Computing with CORBA Middleware (http://www.cs.wustl.edu/~schmidt/corba.html). This site includes a copy of the CORBA specification, a tutorial, and some great overviews of the technology. There’s an interesting CORBA frequently asked questions (FAQ) site at http://www.aurora-tech.com/corba-faq/.
There are FAQs for all the CORBA and IIOP components on this site, including answers on IIOP elements like the Interoperable Object Reference (IOR). The OMG sponsored ORB Interoperability Showcase appears at http://corbanet.dstc.edu.au/.
CORBA, like COM, is a local component implementation. It requires a wire protocol to cross machine barriers. That’s where the Internet Inter-ORB Protocol (IIOP) comes into play. Unlike DCOM, a protocol designed for LAN use alone, IIOP was developed by OMG for Internet use. IIOP, like DCOM, is the wire protocol that enables component communication between remote applications. In fact, you could almost replace the block names in Figure 1.1 with CORBA and IIOP equivalents and see the technology in operation; the ideas are the same, though the names and precise implementation details differ.
If you want to know how the DCOM and CORBA technologies differ from a block diagram perspective, compare the CORBA block diagram (Figure 1.2) at http://www.cs.wustl.edu/~schmidt/corba-overview.html with Figure 1.1 in this chapter.
You’ll see definite similarities in the two architectures. For example, you can view the COM proxy and stub performing a function similar to the CORBA Interface Definition Language (IDL) stub, Dynamic Invocation Interface (DII), IDL skeleton, and Dynamic Skeleton Interface (DSI). Note that CORBA uses more layers to accomplish the same goals as DCOM—some developers say this makes CORBA slower than DCOM even on a LAN (your mileage may vary).
IIOP provides most of the same functionality as DCOM. Unlike SOAP, both IIOP and DCOM can transfer a variety of non-ASCII data types, such as integers and objects. The same binary features that restrict IIOP and DCOM from getting through firewalls allow them to transfer native data types. This makes binary data transfers more efficient; they require fewer data packets.

CORBA and DCOM are both stateful protocols. A user establishes a connection with the server and that connection remains in place during the entire session. This ensures the user’s state information remains intact and enhances the developer’s ability to write applications that interact with the user. HTTP is a stateless protocol. Consequently, SOAP can’t maintain state information, which causes problems. For example, SOAP can’t support properties because the use of properties implies maintained session state information, something that HTTP can’t support.

Another way that CORBA and DCOM are similar (although definitely not equal) is in the area of security. If you want to secure a SOAP data transfer, you need to rely on a third-party product, such as Secure Sockets Layer (SSL). CORBA and DCOM both include built-in security. Unfortunately, security settings often cause problems for both CORBA and DCOM developers. Getting security set high enough so that no one can see your data, yet low enough to get past firewalls, is difficult. In addition, DCOM is especially susceptible to security-setting errors that result in a loss of communication. If the server and the client don’t agree about the level of security to support, then communication doesn’t take place.

Obviously, CORBA and DCOM are quite different internally. The packets they produce are not compatible, some data representations differ, and they use different methods to accomplish the same tasks.
It’s important to understand that these differences cause problems for distributed application developers because a company that uses CORBA can’t communicate with a company that relies on DCOM, and vice versa.
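Returning to the stateful versus stateless distinction drawn above, the following Python sketch contrasts the two styles. It is a simplified illustration, not code from DCOM, CORBA, or any SOAP toolkit: the running-total example and all of the names are hypothetical, and passing the state along with each request is just one common way of coping with a stateless transport.

```python
class StatefulCounter:
    """Connection-oriented style: the server object remembers the running total."""

    def __init__(self):
        self.total = 0  # state lives on the server for the whole session

    def add(self, amount):
        self.total += amount
        return self.total


def stateless_add(request):
    """Stateless style: the request carries everything the server needs.

    Nothing is remembered between calls, so the client must send the
    current total along with the amount to add.
    """
    return {"total": request["total"] + request["amount"]}


# Stateful: the client sends only the new amount each time.
session = StatefulCounter()
session.add(5)
print(session.add(7))

# Stateless: the client keeps the state and resubmits it with every call.
state = {"total": 0}
for amount in (5, 7):
    state = stateless_add({"total": state["total"], "amount": amount})
print(state["total"])
```

Both versions arrive at the same total. The difference is who holds the state between calls, which is why something like an object property, where the server is expected to remember a value, doesn’t map neatly onto a stateless exchange.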
Java RMI

I included Java Remote Method Invocation (RMI) in the chapter because it’s an important adjunct to CORBA. It isn’t a separate or unique wire protocol because it’s built on CORBA, but Java RMI does include some features you need to know about. The most important feature is simplicity. The creators of Java RMI recognized that many developers viewed CORBA as difficult to use and error prone, so they created an easier to use wire protocol in the form of Java RMI.
One of the better overviews of Java RMI appears at http://www.java.sun.com/marketing/collateral/rmi_ds.html. I found the diagrams a little difficult to see, but, otherwise, the information is easy to understand. You’ll also want to view the more detailed Java RMI white paper at http://www.java.sun.com/marketing/collateral/javarmi.html and the specification at http://www.java.sun.com/products/jdk/1.1/docs/guide/rmi/spec/rmiTOC.doc.html.
All you need to do is spend a little time with Java RMI to realize the simplicity it brings to the development environment. The protocol automatically takes care of many of the low-level details you normally need to consider. For example, you can gain access to a remote object by looking it up in a name facility provided by Java RMI, or by receiving the reference as a return value or method invocation argument. In addition, the protocol doesn’t need to consider language or platform differences because communication occurs directly between two Java Virtual Machines (JVMs). You can see that Java RMI is a simplified form of CORBA by looking at the block diagram and architectural discussion at http://www.java.sun.com/products/jdk/1.1/docs/guide/rmi/spec/rmi-arch.doc.html#200.
One thing that might escape your notice at first is that Java RMI also inherits all of the good features of Java. This means you gain access to niceties like garbage collection. When you develop an application using CORBA or DCOM, you have to allocate objects and remember to destroy them manually. Java RMI still requires that you allocate the objects, but the runtime automatically destroys objects that go out of scope (they’re no longer referenced by other objects).
The first feature that I noticed beyond simplicity is that Java RMI doesn’t require a large support library. You don’t have to install anything extra; everything comes with the JVM. Many developers I know have had horrifying experiences getting DCOM to work because of DLL hell. DLL hell is an evil giant of a problem in which older applications either won’t run with new versions of DLLs or they overwrite those DLLs and cause new applications to fail. The fact that Java RMI appears as part of the whole Java package and there’s little chance for incompatibilities to creep in can save more development time than you’d ever imagine.
Java RMI also includes a unique feature: the ability to move behaviors between machines using inheritance. (Normally, Java RMI extends a default class, but special programming techniques allow you to extend other classes as well.) This feature allows a developer to add a new behavior to an existing class. For example, an existing class may perform financial analysis, but you might want it to include sales tax or other added calculations. A new behavior would allow an existing class to perform this task. Contrast this with CORBA and DCOM, which allow a developer to call methods in existing objects or move objects around, but not augment existing object behavior.
Simplicity comes with a high price in this case. For one thing, you’re still dealing with a binary protocol. It’s still impossible to get messages past firewalls in some cases. In addition, you have less control over the environment when problems occur (simplicity gets in the way).
Another problem is that Java RMI is understandably a Java-only solution. As long as you’re willing to make your programming language Java, you’ll be fine. However, Java isn’t noted for its high execution speed and is, therefore, less suitable than other languages for some types of desktop and automation applications.
Java RMI passes ordinary (non-remote) parameters by copy rather than by reference. This means a remote method can’t change the value of an argument you pass; it must provide a return value in other ways. Passing parameters by copy is the Java RMI method of handling some of the intricacies of marshalling. Java RMI always passes remote objects, on the other hand, by reference. The client gains access to an interface, not an object implementation.
SOAP, HTTP, and XML
I can already hear some of the SOAP-savvy among you sighing that I chose to lump SOAP, HTTP, and XML together. It’s true that these technologies are useful as separate entities and don’t necessarily depend on each other. However, the real-world situation today is that developers do use them together, so for the sake of simplicity, I’m lumping them together for now.
So, what does a SOAP message look like? It depends on the technologies that you use with SOAP, which is why I specified the three technologies I’m using up front. Here’s a simple example, so simple, in fact, that you probably wouldn’t use this form in a real message.
Mueller John
This example points out several similarities between Web technology that you’re already familiar with and SOAP. Notice that everything has an opening and closing tag. That’s the XML influence on SOAP, but it also has its roots in the HyperText Markup Language (HTML) we’re all familiar with. A SOAP message always appears within an envelope. The envelope can have a header, just like an HTML document would, but this isn’t required. The message must have a body. The message content appears within the body as shown here. We’re making a request of the GetPerson function for a person whose last name is Mueller and first name is John. You can extend all of these tags. In fact, you’ll likely extend most of them when creating standard SOAP messages. For example, the envelope tag will probably look something like this in a real message.
The second parameter, in this case xmlns, is the namespace used for the envelope. A namespace reference tells where to find definitions for classes that contain functions used within the message. So, let’s expand this principle to the GetPerson function mentioned previously. Here’s what the body of the example message might actually look like.
Mueller John
The xmlns attribute contains the location of the MyObj object on your company server. This object contains the GetPerson function used in this example. As you can see, a SOAP message could quickly begin to look complex, but it’s simple in design.
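To make the envelope, body, and namespace pieces concrete, here’s a minimal Python sketch that assembles a GetPerson request with the standard library. It isn’t the listing from this chapter. The envelope namespace URI is the one the SOAP 1.1 specification defines, but the MyObj namespace URI, the m prefix, and the LastName and FirstName element names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace that SOAP 1.1 defines for the Envelope and Body elements.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for the MyObj object; substitute your own server's URI.
MYOBJ_NS = "http://www.example.com/MyObj"


def build_get_person(last_name: str, first_name: str) -> bytes:
    """Return a SOAP envelope (as UTF-8 bytes) that calls GetPerson."""
    # Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("m", MYOBJ_NS)

    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")  # the body is required
    call = ET.SubElement(body, f"{{{MYOBJ_NS}}}GetPerson")
    # Assumed parameter element names; the real ones depend on the service.
    ET.SubElement(call, "LastName").text = last_name
    ET.SubElement(call, "FirstName").text = first_name
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    print(build_get_person("Mueller", "John").decode("utf-8"))
```

Running the script prints the message on a single line. Notice that the optional header element is simply left out, while the body always appears.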
Remember that you must always pair all XML and SOAP tags; there’s always an opening and closing tag. In cases in which you don’t need to include any data between tags, you can signify the closing tag by adding a slash at the end of the tag like this example of a paragraph tag: “<p />”.
We’ll discuss namespaces in more detail as the book progresses. For now, all you need to know is that SOAP messages have a specific format that looks familiar if you’ve worked with XML or even HTML in the past.
OK, I’ve shown you the XML and SOAP part of the picture; so where does HTTP come into play? SOAP uses HTTP as a transport protocol, the method for getting a message from one place to another. A SOAP message transaction normally occurs in two phases. The first appears within an HTTP request message. The client requests data from the server, just as it would for a Web page. The server places the requested data within the SOAP message that the Web server packs into an HTTP response message. The Web server uses the same technique as it would normally use; only the content of the HTTP message differs from normal.
A standard browser can view the SOAP message you create. For example, Figure 1.2 shows the output from the sample code we looked at earlier. Since we didn’t create an XSL file for this example, the browser uses the default template. This template shows you the content of the SOAP message with keywords and content highlighted in a variety of colors. Normally, you’d pass the SOAP response message to a client-side application for interpretation and display.
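The request phase is easier to picture with a little code. The following Python sketch wraps a SOAP message in an HTTP POST request without actually sending it. The Content-Type of text/xml and the SOAPAction header come from the SOAP 1.1 HTTP binding; the endpoint URL, the SOAPAction value, and the abbreviated message are hypothetical stand-ins for whatever your own service expects.

```python
import urllib.request


def make_soap_request(url: str, soap_xml: bytes, soap_action: str) -> urllib.request.Request:
    """Wrap a SOAP message in an HTTP POST request (nothing is sent here)."""
    headers = {
        # SOAP 1.1 over HTTP carries the message as text/xml.
        "Content-Type": "text/xml; charset=utf-8",
        # SOAP 1.1 requires a SOAPAction header; its value is a quoted URI.
        "SOAPAction": f'"{soap_action}"',
    }
    return urllib.request.Request(url, data=soap_xml, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical, abbreviated message; a real one needs the namespace declarations.
    message = b"<Envelope><Body><GetPerson/></Body></Envelope>"
    request = make_soap_request(
        "http://www.example.com/soap/listener",  # hypothetical endpoint
        message,
        "http://www.example.com/MyObj#GetPerson",  # hypothetical action URI
    )
    print(request.get_method(), request.full_url)
    for name, value in request.header_items():
        print(f"{name}: {value}")
```

Passing the request object to urllib.request.urlopen() would send it and return the HTTP response that carries the SOAP reply, which is the second phase of the transaction.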
Now that you have a taste of what SOAP is like, you might want to view the SOAP specification. The current specification, as of this writing, is version 1.1. You’ll find it at http://static.userland.com/xmlRpcCom/soap/SOAPv11.htm (among other places). Not only will the specification fill you in on additional details, but you’ll also see some example messages that contain the HTTP header, as well as the SOAP message. I chose to concentrate on just the SOAP portion of the message in my example listings.
Figure 1.2 Internet Explorer versions 5.5 and above can display SOAP messages in a simple format.
Problems Solved by Using SOAP
Some developers are dubious about the benefits of using SOAP because they see it as a complex way to transfer data from one location to another using unsafe methods. For example, SOAP doesn’t have any form of internal security and you can’t use it when you need transactions. One developer even stated that SOAP was a poor choice for any kind of database work, unless you include safeguards at both ends of the transaction. (This is where all of those classes on three-tier design come into play: SOAP should access only the business logic on your server, not the database itself.)
No matter what the developer community comes up with for a data transfer protocol, there are always going to be negative points to consider. SOAP is a “fire and forget” solution to transferring data on the Internet that works well in some cases and not at all in others. I’m not going to convince you in this section that SOAP is the perfect solution for every problem. You shouldn’t consider it the solution for all problems either; SOAP has limitations.
All arguments aside, SOAP does solve two very important problems and a host of smaller problems. The first problem is the matter of network security versus data transfer. DCOM and CORBA are both binary data transfer protocols. This means they both rely on non-ASCII characters that a Web server firewall could interpret as executable code and packet formats that the firewall will interpret as invalid HTTP packets. In both cases, the firewall throws the packets out because it’s designed to work that way. Since the Web server requires the firewall to keep crackers out, there’s an obvious problem with using binary data transfers. SOAP solves the problem by using ASCII in an HTTP packet, something the firewall will instantly recognize as data and pass along to the Web server for processing.
The second major problem is one of compatibility. If you use a binary protocol, the client and server must both use the same method for retrieving the data from the packet. Binary compatibility is easy to achieve when programmers in one company maintain both the client and server. However, distributed applications aren’t one-company solutions. You now have other companies in the form of partners and clients to consider. Transferring ASCII data within an HTTP packet between companies is easier and less error prone than binary methods. SOAP combined with UDDI makes it easy for customers to find you and use the resources you provide. Services such as online ordering are easier to implement when everyone has common ground for participation.
It’s ironic that one of the features that makes binary protocols so useful on a LAN actually gets in the way when working in a distributed environment. DCOM provides a wealth of security-setting features that enable you to fine-tune your approach to keeping your data secure. For example, you can choose to check a user’s identity only once at the outset of a communication. On the other hand, you can ask the user for identification information before each transaction. While this does aid in security, it also means that DCOM is doing a lot of handshaking with the application, something that a low-bandwidth connection like the Internet won’t support well. SOAP doesn’t implement any security internally, so there isn’t any security-related handshaking to worry about when using this protocol. SOAP relies on security like SSL that’s designed for Internet use, so again, the effect of security on system performance is minimal.
Proxy servers can also present problems for the developer. The target for a remote call might not have a publicly routable IP address. This means that a proxy server will have to route the packets to the correct internal address, adding yet another failure point where incorrect configuration information can keep an application from working. SOAP is interpreted by a listener component that has a direct public IP address. You still gain security benefits because the listener doesn’t do anything with the data; an internal (nonaccessible) component works with the data after the listener interprets it. There’s plenty of opportunity to check the data for security and corruption problems as part of the translation process.
Older wire protocols often rely on nonstandard TCP/IP ports for communication so they don’t interfere with standard network traffic. One of the more common ports is 135, but these protocols can use other ports as well. What happens if a network administrator reconfigures the firewall to close the port the application needs? It may take a long time to find the problem because this is an unexpected error. DCOM and CORBA will simply report they can’t contact the server, leaving you in the dark as to the cause of the problem. SOAP solves this problem by using standard communication ports that a network administrator is unlikely to close.
This is the tip of the iceberg. Microsoft and other vendors designed SOAP primarily to solve the problems of Internet communication. It isn’t too surprising that it excels in the kinds of communication that you’d expect to perform on the Internet. You’ll run into problems, however, if you assume that SOAP is the only answer. DCOM, CORBA, and Java RMI still provide many useful features not found in SOAP that address distributed application programming within a company or tightly knit group of companies.
Performance Issues
It’s important to realize that SOAP represents a new method of transferring data from one point to another. Unlike binary technologies used with LANs, SOAP travels between distributed applications on a public network using ordinary text. The differences between SOAP and other transport methods like DCOM Wire Protocol mean that you can’t rely on old techniques to ensure swift transmission of data. The following sections will look at the two sides of the SOAP performance coin. On the one hand, there’s good reason to suspect that SOAP won’t perform as well as binary data transfer methods. For one thing, it’s a verbose data transfer method. On the other hand, SOAP doesn’t require complex translation methods on both ends of the wire. Plain text can get through firewalls with relative ease and is understandable by all platforms.
Tradeoffs of Using SOAP
There are many performance tradeoffs to consider when working with SOAP. One of the most obvious performance tradeoffs is the compactness of binary data when compared to the ASCII text used by SOAP. Although ASCII text is generally compatible with any system on the market, it’s also bulky. (Look at the simple coding examples in the “SOAP, HTTP, and XML” section of this chapter for details.) Unfortunately, there isn’t any direct way to alleviate this problem. You can’t compress the data without losing some of the compatibility that SOAP is supposed to provide. The size of your messages is a concern because SOAP works on low-bandwidth networks for the most part. The Internet isn’t as fast as a LAN or other local dedicated connection. In fact, large message sizes will also affect local server resources (more storage space and longer handling times). Overall, message size is the one performance problem that you’ll have a hard time solving.
You can secure a SOAP transaction if you want to. For example, since SOAP relies on HTTP as a transport protocol, you have access to all of the same security measures that you can use with HTTP, like Secure Sockets Layer (SSL). However, as anyone who has ever used SSL will tell you, connection speed suffers greatly. It doesn’t take long to realize that you can either have a slow secure SOAP connection or a fast connection that gives everyone access to your data. SOAP lacks the alternatives that other protocols such as DCOM provide.
Another potential problem is a two-edged sword. SOAP messages consist of plain ASCII data in XML format. This means that an application on the client has to translate binary data into an ASCII format for transport. Another application on the server end has to translate the data from ASCII format back into a binary format. The translation of data takes time. Just how much time depends on the amount of data you need to transfer and the original data format. Transferring graphics or other strictly binary data is an expensive proposition, while database fields are less problematic if they contain a lot of string data to begin with. (You’ll notice that many SOAP examples are queries for data from a database, but few show record updates or transfers of data that’s normally interpreted from a binary format.) We’ll see in the next section of the chapter that there’s an up side to the use of ASCII text to transfer your data that could become a performance boost.
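You can get a feel for the size penalty with a quick experiment. This Python sketch compares the byte count of a value in a packed binary form with the same value wrapped in an XML element, and it does the same for a block of raw binary data. It’s a rough illustration under stated assumptions: the Quantity element name is hypothetical, and Base64 is used here as one common way to make binary data text-safe, not as something this chapter prescribes.

```python
import base64
import struct


def wire_sizes(value: int, blob: bytes) -> dict:
    """Return byte counts for binary and text encodings of the same data."""
    binary_int = struct.pack("<i", value)  # 32-bit little-endian integer
    # Hypothetical element name; any tag adds similar overhead.
    text_int = f"<Quantity>{value}</Quantity>".encode("ascii")
    text_blob = base64.b64encode(blob)  # binary data converted to ASCII text
    return {
        "binary_int": len(binary_int),
        "text_int": len(text_int),
        "binary_blob": len(blob),
        "text_blob": len(text_blob),
    }


if __name__ == "__main__":
    sizes = wire_sizes(1234567, bytes(range(256)) * 4)
    for name, size in sizes.items():
        print(f"{name}: {size} bytes")
```

The integer grows several times over once it’s wrapped in tags, and the Base64 text is roughly a third larger than the raw bytes it represents. Neither figure includes the envelope or the HTTP header, so real messages are larger still.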
When SOAP Is Faster
The SOAP performance picture isn’t entirely grim. In fact, it can be a faster transport mechanism than those binary alternatives. Consider for a moment that we’re talking about a distributed application environment with firewalls and other obstacles to overcome when transferring data from one point to another. A SOAP message requires no translation to get past the firewalls; binary data may very well require translation, if it gets past the firewall at all. Here’s the second part of the two-edged sword mentioned in the previous section.
There’s no simple answer to the question of whether it takes more time to transfer binary or ASCII data over the Internet because there are too many variables to consider. Binary data transfers could take more time if a firewall configuration issue forces the server to resend the data multiple times. On the other hand, you have the size of the SOAP message to consider. The only way you’ll know how well SOAP is going to perform is to test it in actual use and make a comparison to similar tests using technologies such as DCOM. If performance is a major criterion, then you need to perform such tests.
Another, more subtle, performance difference is that SOAP requires no handshaking. When you’re working with a low-bandwidth, high-latency medium like the Internet, handshaking can become a real performance problem. DCOM requires keep-alive messages and other housekeeping messages that don’t significantly affect the performance of a LAN, but eat precious bandwidth on most Internet connections.
Sometimes performance tuning an application means getting creative, rather than accepting you can’t do much to improve performance. For example, reducing the length of string fields can mean the difference between transferring one or two packets. Reducing the number of packets will increase performance. If you can’t reduce the size of the strings, then try to use keywords in place of some of the fields. For example, if a field can only contain a limited number of selections, try passing a number instead of a string. The component on the receiving end can translate the number back into a string as part of the process of reading the SOAP message.
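The number-for-string trick in this tip can be sketched in a few lines of Python. The list of shipping methods and the ShipMethod element name are hypothetical; the only real requirement is that the sender and the receiving component agree on the same list in the same order.

```python
# Hypothetical list of allowed values; both ends of the wire must share it.
SHIPPING_METHODS = ("Standard Ground", "Two-Day Air", "Overnight Express")


def to_code(label: str) -> int:
    """Sender side: replace the string with its position in the shared list."""
    return SHIPPING_METHODS.index(label)


def from_code(code: int) -> str:
    """Receiver side: translate the number back into the original string."""
    if not 0 <= code < len(SHIPPING_METHODS):
        raise ValueError(f"unknown shipping code: {code}")
    return SHIPPING_METHODS[code]


def field_xml(value) -> str:
    """Wrap a value in a (hypothetical) ShipMethod element."""
    return f"<ShipMethod>{value}</ShipMethod>"


if __name__ == "__main__":
    as_text = field_xml("Overnight Express")
    as_code = field_xml(to_code("Overnight Express"))
    print(len(as_text), "bytes as a string;", len(as_code), "bytes as a code")
    print("Receiver sees:", from_code(to_code("Overnight Express")))
```

The savings on one field are small, but they add up when a message carries many such fields or when you send many small messages.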
SOAP can also create smaller messages in some cases. A binary protocol uses large object reference tags in some situations. The resulting message is actually larger than the SOAP message. A DCOM packet contains information other than the data that you're asking the protocol to transfer, which is the reason for the increase in message size. Admittedly, this is only the case when the object is relatively small. However, if you're making a lot of small requests, rather than a few large requests, the difference in request header size can make a difference.
Don’t get the idea that there’s any free lunch when it comes to SOAP. Yes, DCOM uses a large object reference tag that could adversely affect performance, but DCOM also supports features like properties. SOAP and DCOM aren’t equivalent; DCOM provides more functionality than SOAP for the near future, so you may still find that you need DCOM functionality even if using it does hurt performance.
SOAP and the Web Server
When you’re working with SOAP, you’ll eventually need to deal with a Web server. Since Web servers differ in capability and features, it’s almost certain that you’ll run into problems getting SOAP to work with some of them properly. For example, most developers worry about interoperability between Internet Information Server (IIS) and the Apache server used by many UNIX servers right now.
You’ll find two common problems working with Web servers from different vendors. First, there’s the problem of data reception. A Web server commonly uses a listener component to pick up SOAP messages and act upon them. If every listener acted the same way, there wouldn’t be a problem. However, there are nuances of difference between SOAP listener versions, so developers often run into compatibility problems. The amount of trouble you run into will depend on the SOAP toolkit you use to create a solution. There are some listener components written in Java that purport to fix these problems, but little idiosyncrasies remain.
Another problem is one of translation. Getting the data to the server doesn’t help much if the server can’t interpret it. As we saw earlier, SOAP relies on an XML format to get data from one place to another. What happens if the server is expecting tags in one order and the client provides them in another? Obviously, a smart developer creates components that will interact with the data no matter what order it comes in, but unforeseen problems can occur during the initial design process.
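One way to avoid the tag-order trap is to look elements up by name instead of by position. The Python sketch below shows the idea for a GetPerson request. The element names are the same assumed names used for illustration elsewhere in this chapter’s discussion, and a real listener would also have to deal with the envelope and namespaces, which this fragment leaves out.

```python
import xml.etree.ElementTree as ET


def read_person(xml_text: str) -> tuple:
    """Return (last_name, first_name) regardless of the order of the child tags."""
    request = ET.fromstring(xml_text)
    # Index the children by tag name so their position no longer matters.
    fields = {child.tag: (child.text or "").strip() for child in request}
    missing = {"LastName", "FirstName"} - fields.keys()
    if missing:
        raise ValueError(f"missing elements: {sorted(missing)}")
    return fields["LastName"], fields["FirstName"]


if __name__ == "__main__":
    one = "<GetPerson><LastName>Mueller</LastName><FirstName>John</FirstName></GetPerson>"
    two = "<GetPerson><FirstName>John</FirstName><LastName>Mueller</LastName></GetPerson>"
    print(read_person(one) == read_person(two))
```

Both orderings produce the same result, and a message that omits a required element fails with a clear error instead of silently shifting values into the wrong fields.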
Why Make the Move to SOAP?
SOAP isn’t a magic bullet designed to kill the problems that plague most of us in a world of ever-increasing complexity. A distributed application requires time and patience to build, and a new technology is unlikely to reduce that complexity to any extent. However, SOAP can reduce the difficulty of implementing a distributed application solution by reducing the number of potential problem areas. The mere fact that SOAP makes it easier to move your data through firewalls is enough to use this technology when creating distributed applications.
Debugging is another reason to use SOAP. When working with DCOM and other binary technologies, you can’t really see the data flow between machines. You can see the data before the sender transmits it and you can see it once it arrives at the destination, but you can’t see the part in-between (unless you’re very good at reading the binary data in packets). Since SOAP relies on plain text transmission, debugging is easier because you can see flaws in the data transmission itself.
Developers who create many distributed applications often cite compatibility as the reason to use SOAP. Using a binary protocol implies that every machine will understand that protocol equally. In other words, if you choose CORBA IIOP to transfer your data, every machine involved needs to speak that protocol. (One kludge that developers use is to write code that bridges translation differences between different protocols such as DCOM and CORBA, but this is an error-prone method of handling the problem.) SOAP promises to provide a platform-independent transfer medium since every computer speaks plain text. The only requirement is that the computers use the same standard to translate the XML text into data formatted for the application, a much easier requirement to fulfill.
A few developers are finding that a hidden reason to use SOAP is development speed. SOAP is a simple protocol when compared to DCOM or CORBA. Not only do you have an opportunity to debug your application with relative ease, but the coding requirements are easier to understand and less error prone. Developers run into fewer configuration-related bugs in their applications and spend more time locating real problems. Once SOAP is standardized, you may find that the biggest developer-specific reason for moving to SOAP is to get more coding done faster and with fewer errors. Making the developer more efficient always pays big dividends in productivity and developer happiness.
Case Study
The main reason to use SOAP today is to create distributed applications that run over the Internet. Yes, you can use SOAP for other types of development and it’ll work fine, but SOAP isn’t a cure-all for every programming problem on your network. It’s one solution out of many. DCOM and other wire protocols still have a place in your enterprise and you need to keep that in mind as you develop new applications.
So, how are some companies using SOAP today? Many of the pioneers are still in the testing stage as I write this. SOAP is an exciting technology, but the standard is still changing on a daily basis, so many companies have relegated SOAP to research for future products. The bottom line is that you shouldn’t rush into a SOAP project because it looks interesting.
One company that I talked with recently has begun using SOAP for a mainstream application and for good reason: SOAP really is the only answer in this case. This company has requested anonymity, so we’ll refer to them as ABC Corporation for the purposes of this case study. The fact of the matter is that this company could just as easily be the one that you’re working in now.
Definition of a Problem
The management at ABC Corporation began looking at the company’s bottom line as profits began to drop in a slowing economy. While ABC Corporation had a great sales staff, a good product line, and relatively competitive prices, in some cases they couldn’t get their product out in time to beat competitors. Taking orders by telephone, filling out paper orders, and the like simply cost the company too much time. Management decided that direct business-to-business sales over a computer connection would speed things up considerably and reduce labor costs as well. A faster processing cycle could mean the difference between staying afloat and truly excelling at business.
It’s a nice thought and it’ll probably work. The only problem is that the programming staff at ABC Corporation had no idea how to implement the solution. Their company relied heavily on UNIX servers, while the majority of their customers had Windows NT or Windows 2000 servers. Direct data communication was impossible, at least direct communication in the form expected by management. Automation might prove impossible if the development staff didn’t find a common form of communication.
At first, ABC Corporation tried the old standby of using forms on their Web site and Common Gateway Interface (CGI) scripts to process the orders. Unfortunately, the installation proved less robust than management wanted. It was also slow, error prone, and didn’t consider individual company needs. Although the system might have worked, no one used it regularly because it was cumbersome and time-consuming—typing out a paper form was still easier.
Light at the End of the Tunnel
ABC Corporation tried a number of other solutions. All of these solutions failed for one reason or another. The final straw was when they discovered that the Java RMI solution they implemented was too slow and cumbersome. Management decided to call in a consultant. A number of consultants suggested ways to get around the company’s problems, but ABC Corporation finally decided to try SOAP. They decided that SOAP would circumvent the majority of the problems they knew about and the other problems were easy enough to solve using other methods. Development went slowly, a concession to being the first on the block with a new solution. However, the company eventually had a workable solution to test. Profitability had to be right around the corner.
Perfection, an Ongoing Process
Life is often a twisted path, rather than a straight road. ABC Corporation found that their new solution was less than perfect. Compatibility problems between the Windows implementation of the SOAP listener and the Apache version caused a few hours of nail biting as the consultant and development staff rushed to fix the problems. The new code worked around the compatibility problems, performed a few additional checks, and included some more error trapping. While the application was now stable, it did execute slightly slower than before. Sometimes there isn’t a perfect solution, just one that works. The lesson that ABC Corporation learned in this case is that they should have spent more time looking for a SOAP toolkit that works equally well on Windows and Apache.
Some of the problems were outside the control of ABC Corporation or their partners. One of the things that SOAP assumes is a perfect network and we all know that the Internet is far from perfect. Although ABC Corporation has improved system uptime and provided their partners with alternative means for submitting orders when needed, glitches still occur. They don’t occur on a regular basis, but even one lost order might result in the loss of a customer. The lesson learned here is that you need to build plenty of redundancy into your SOAP implementation and provide alternative communication methods in the case of failure.
ABC Corporation also added an email response to their setup. The customer receives an email when ABC Corporation receives the order and another message when they send the order out. This additional communication is completely automated, so it costs ABC Corporation little and greatly enhances customer satisfaction.
ABC Corporation plans to roll out their SOAP solution to all partners in about three months as I write this. The testing phase is still ongoing. They currently have a pilot program for several partners and the results look promising. It would be difficult to place a precise date on return on investment at this point. However, ABC Corporation can process orders faster, with less human intervention, and greater customer satisfaction. Just these changes make the bottom line look a lot better.
CHAPTER 2
SOAP in Theory
In this chapter
Dissecting the SOAP Message
Using SOAP with Your Current Code
Discovering SOAP Services
Putting Everything Together
Using SOAP to Move Data
Understanding SOAP Attachments
Case Study
This chapter answers the question “How does it work?” It’s important to know the answer to this question because you’ll want to troubleshoot your SOAP applications after you begin developing them. I’ll provide you with details about the parts of SOAP that I know you’ll need to create applications. The chapter also contains pointers to Web sites that contain deep theoretical information that’s nice to know, but that you really don’t need at this moment to create applications. In short, this chapter serves as a practical guide to SOAP with pointers to the parts that only an engineer could love.
SOAP is a new technology seeking the approval of a standards group. What this means to you, as a developer, is that SOAP will continue to change shape until the standards committee finally settles on a set of features. During the next few months you’ll find yourself downloading new feature lists, upgrading your tools, and spending a lot of time reading the latest theory on how SOAP should work. New technology is great because it allows you to perform tasks that you couldn’t perform in the past, but it also requires extra patience when things don’t work as well as they should.
The first section of this chapter, “Dissecting the SOAP Message,” tells you how a SOAP message is put together. We’ll begin looking at the entire package in detail, everything from the HyperText Transfer Protocol (HTTP) header to the eXtensible Markup Language (XML) used to create the SOAP message. In the end, you’ll know how a typical message looks so that you can diagnose problems with messages that your applications create.
The second section, “Using SOAP with Your Current Code,” provides some basics on retrofitting existing code to use SOAP. Many organizations created DCOM or CORBA solutions for distributed application needs in the past. This section provides an overview of some of the pitfalls of converting these applications to use SOAP.
In the third section, “Discovering SOAP Services,” you’ll learn some of the technologies available for finding a SOAP service you need for an application. The three technologies that we’ll discuss are Discovery of Web Services (DISCO), Web Service Description Language (WSDL), and Universal Discovery, Description, and Integration (UDDI). Section four will put the final pieces of a basic SOAP application into place. I’ll tell you how everything works from a theoretical perspective. Of course, theory is often different from reality, which is why this book has far more hands on chapters than it does theory. We’ll also visit some of the pitfalls that the hands on chapters help you solve. Of course, the whole purpose behind using SOAP is to move data. Section five, “Using SOAP to Move Data,” addresses that issue. It looks at how you deliver data using SOAP. More importantly, it discusses some of the current limitations on moving data using SOAP and how the vendors involved in the SOAP effort plan to address that need. The next section of the chapter, “Understanding SOAP Attachments,” looks at one alternative for moving data from one point to another. The SOAP attachment is just one of several alternative delivery methods currently under discussion. Many developers feel that attachments offer the most promise because they’ll allow SOAP to deliver any type of data. In addition, a developer can encrypt an attachment, improving overall security.
The final section of the chapter contains a case study. We’ll look at how one company used SOAP to overcome vendor relationship problems. SOAP works best in an environment where more than one company needs to exchange information and the data exchange occurs over a public network like the Internet.
Dissecting the SOAP Message You can divide SOAP messages into two basic categories: Requests and responses. The client sends a request to the server. If the server can fulfill the request, then it sends a data message back to the client. Otherwise, the server sends an error message indicating why it couldn’t send a response back to the client. In most cases, the problem is one of security, access, equipment failure, or an inability to find the requested object or data. SOAP messages don’t exist within a vacuum. If you send just a SOAP message, it will never reach its destination. Remember from Chapter 1 that SOAP is a wire protocol—it relies on another protocol for transport. This is the same technique used by other wire protocols, so there’s nothing strange about SOAP when it comes to data transfer needs. The most common transport protocol in use today is HTTP, so that’s what we’ll look at in this section. Keep in mind, however, that SOAP can theoretically use any of a number of transport protocols and probably will in the future. Why Use Another Transport Protocol for SOAP? I keep talking about how you shouldn’t assume that HTTP is the only transport protocol you should use with SOAP. Accepting this statement at face value is one thing, understanding why is another. It’s important to know why you’d need another transport protocol, other than sheer stubbornness in not liking HTTP for whatever reason. The first reason is the one that most businesses will eventually accept as the main reason to change to another protocol. HTTP is a synchronous protocol. In other words, the client and server have to exist at the same time and generally communicate in real time. Not all business applications fit this model very well, which is why technologies such as Microsoft’s Queued Components are popular. Your business may eventually use another transport protocol to gain asynchronous message support. The second reason is security and privacy concerns. 
The HTTP protocol is very good at what it’s designed to do. However, as various countries begin to grapple with the issues of security and privacy, they may find that HTTP is too open to meet the demands of modern business. While HTTP continues to provide services when you surf the Web, developers may relegate it to the closet in business communication. Finally, there are practical reasons to use some other transport protocol. As we’ll see in the “HTTP Extension Framework” section of this chapter, vendors are currently considering some standard method for showing differences between HTTP standard elements and extensions to that standard. Remote procedure calls (RPCs) fall into the extension category—they aren’t a part of the HTTP standard. Using another transport protocol might become necessary if HTTP extensions become a compatibility problem.
SOAP messages are also a special form of XML. Therefore, in addition to the HTTP wrapper, a SOAP message requires an XML wrapper. All that the XML wrapper does, in this case, is tell the data receiver that this is an XML formatted message. The SOAP part of the message contains all the data; however, SOAP uses XML to format the data.
Figure 2.1 shows a common SOAP message configuration. Notice the SOAP message formatting. This isn’t the only way to wrap a SOAP message in other protocols, but it’s the most common method in use today. Figure 2.1 An illustration of how a SOAP message is commonly encased within other protocols.
The next three sections are going to show you how a SOAP message actually appears during transmission. We’ll use Figure 2.1 as an aid for discussion. It’s the only time we’ll explore a complete request or response in the book because you only need to worry about the SOAP message in most cases. The first section looks at the HTTP portion of SOAP. The second helps you understand where XML comes into play. Finally, you’ll look at the SOAP message itself.
Working with the new capabilities provided by technologies like XML and SOAP means dealing with dynamically created Web pages. Although it’s nice that you can modify the content of a Web page as needed for an individual user, it can also be a problem if you need to troubleshoot the Web page. That’s where a handy little script comes into play. Type javascript:’’+document.all(0).outerHTML+’’ in the Address field of Internet Explorer for any dynamically created Web page and you’ll see the actual HTML for that page. This includes the results of using scripts and other page construction techniques.
Viewing the HTTP Portion of SOAP The HTTP portion of a SOAP message looks much the same as any other HTTP header you may have seen in the past. In fact, if you don’t look carefully, you might pass it by without paying any attention. Like most HTTP transmissions, there are two types of headers—one for requests and another for responses. Figure 2.1 shows examples of both types. As with any request header, the HTTP portion of a SOAP message contains an action (POST, in most cases), the HTTP version, a host name, and some content length information. The POST action portion of the header contains the path for the SOAP listener, which is either an ASP script or an ISAPI component. Also located within a request header is a Content-Type entry of text/xml and a charset entry of utf-8. The utf-8 entry is important right now because many SOAP toolkits don’t support utf-16 and many other character sets (at least not as of this writing). You’ll also find the unique SOAPAction entry in the HTTP request header. It contains the Uniform Resource Identifier (URI) of the ASP script or ISAPI component used to parse the SOAP request. If the SOAPAction entry is “”, then the server will use the HTTP Request-URI entry to locate a listener instead. This is the only SOAP specific entry in the HTTP header—everything else discussed can appear in any HTTP formatted message.
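The request header just described can be sketched in a few lines of Python. This is only an illustration of the layout, not the output of any particular toolkit: the host name, the listener path, and the function name build_soap_request are assumptions made up for the example, and the envelope is a placeholder string.

```python
SOAP_CONTENT_TYPE = 'text/xml; charset="utf-8"'


def build_soap_request(host, listener_path, soap_action, envelope):
    """Return the raw bytes of an HTTP POST request that carries a SOAP message."""
    body = envelope.encode("utf-8")  # must match the charset named in Content-Type
    header_lines = [
        f"POST {listener_path} HTTP/1.1",  # action, listener path, HTTP version
        f"Host: {host}",
        f"Content-Type: {SOAP_CONTENT_TYPE}",
        f"Content-Length: {len(body)}",    # counted in bytes, not characters
        f'SOAPAction: "{soap_action}"',    # "" tells the server to use the Request-URI
    ]
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical host and listener path, used only for illustration.
request = build_soap_request(
    host="www.example.com",
    listener_path="/soap/listener.asp",
    soap_action="http://www.example.com/soap/listener.asp",
    envelope="<SOAP-ENV:Envelope>...</SOAP-ENV:Envelope>",  # placeholder message
)
print(request.decode("utf-8"))
```

Passing an empty string for soap_action produces the SOAPAction: "" form, which leaves the server to locate the listener from the Request-URI.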
The SOAPAction entry currently works with the HTTP protocol, but not other protocols such as Simple Mail Transfer Protocol (SMTP). Vendors involved in the SOAP specification are currently discussing the use of the SOAPAction entry. The SOAPAction entry may have changed a little by the time you read this, so you can use it with other protocols.
One optional addition to the request header is the mandatory HTTP Extension Framework declaration. (The word mandatory describes how the recipient must treat the extension, not whether you have to include it.) You’ll notice that the action for such a request header is different. For example, instead of POST, you’ll see M-POST in the header. This type of header also includes an entry such as Man: “http://schemas.xmlsoap.org/soap/envelope/”. The effects of the HTTP Extension Framework are discussed later in the chapter.
UTF stands for Unicode Transformation Format. UTF represents one standard method for encoding characters. One of the better places to learn about UTF-8 is http://www.utf8.org/. You can find a good discussion of various encoding techniques at http://www.czyborra.com/utf/. This Web site presents the information in tutorial format. The fact remains that you need to use the UTF-8 character set when working with SOAP.
The response header portion of the HTTP wrapper for a SOAP message contains all of the essentials as well. You’ll find the HTTP version, status, and content length as usual. Like the request header, the response header has a Content-Type entry of text/xml and a charset entry of UTF-8. There are two common status indicators for a response header: 200 OK or 500 Internal Server Error. While the SOAP specification allows leeway in the positive response status number (any value in the 200 series), a server must return a status value of 500 for SOAP errors to indicate a server error. Whenever a SOAP response header contains an error status, the SOAP message must include a SOAP fault section. We’ll talk about SOAP faults later in this chapter. All you need to know now is that the HTTP header provides the first indication of a SOAP fault that requires additional processing. There are other applicable status error codes for the response header. For example, if the client sends a standard HTTP header and the server wants to use the HTTP Extension Framework, it can respond with a status error value of 510 Not Extended. The 510 error isn’t necessarily fatal; a client can make the request again using the mandatory HTTP Extension Framework declaration. In this case, an error message serves to alert the client to a special server requirement.
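A client can use the status rules above to decide what to do before it ever parses the message body. The following is a minimal sketch of that decision; the function name and the labels it returns are my own invention for illustration, not part of any SOAP toolkit.

```python
def classify_soap_status(status_code):
    """Map an HTTP status code to the next step for a SOAP client."""
    if 200 <= status_code < 300:
        return "ok"                    # any 200-series value is a positive response
    if status_code == 500:
        return "fault"                 # the body must contain a SOAP fault section
    if status_code == 510:
        return "retry-with-extension"  # resend using the mandatory M-POST form
    return "http-error"                # an ordinary HTTP problem, not SOAP specific


for code in (200, 202, 500, 510, 404):
    print(code, classify_soap_status(code))
```

Only the "fault" case requires fault parsing. The "retry-with-extension" case isn't fatal, as noted above.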
Viewing the XML Portion of SOAP All SOAP messages are encoded using XML. SOAP follows the XML specification and can be considered a true superset of XML. In other words, it adds to the functionality already in place within XML. Anyone familiar with XML will feel comfortable with SOAP at the outset; all you really need to know are the SOAP nuances. Although the examples in the SOAP specification don’t show an XML connection (other than the formatting of the SOAP message), most real world examples will contain at least one line of XML specific information. Here’s one example of an XML entry:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
As you can see, the tag is quite simple. The only bits of information that it includes are the XML version number, the character set (encoding), and whether the message is standalone. As with the HTTP header, you’ll need to use the UTF-8 character set for right now. The version number will change as new versions of XML appear on the scene. The standalone attribute determines if external markup declarations could affect the manner in which this XML document is processed. A value of no means external documents could affect the processing of this document.
We won’t discuss all the XML tag attributes (declarations) in this chapter. You can find a complete listing of these attributes at http://www.w3.org/TR/REC-xml. For those of you who don’t read specifications very well (or prefer not to), look at Tim Bray’s annotated XML specification Web site at http://www.xml.com/axml/testaxml.htm. Another good place to look is the XML.com Web site at http://www.xml.com/. Finally, if you want to see the tools and other resources available for XML, look at http://www.projectcool.com/developer/xmlz/xmlref/examples.html.
Some developers don’t include all of the XML tag attributes in their SOAP messages. So far, I haven’t seen any problems with leaving the encoding and standalone attributes out of the picture. You should, however, always include the XML version number, if for no other reason than the need to document your code and ensure there are no compatibility problems with future SOAP implementations.
Working with the SOAP Message You’ve already seen several examples of simple SOAP messages. A message consists of an envelope that contains both a header and a body. The header can contain information that isn’t associated with the data itself. For example, the header commonly contains a transaction ID when the application needs one to identify a particular SOAP message. The body contains the data in XML format. If an error occurs, the body will contain fault information, rather than data. Chapter 1 shows a simple example of a SOAP message, along with a detailed description of the various message elements. Now that you have a summary of the SOAP message content, let’s look at some particulars you’ll need when working with SOAP. The following sections will explain some technical details needed to understand the SOAP message fully. The first section tells you about the SOAP data transfer requirements as opposed to those supported by HTTP. The second section tells you about a Web site that allows you to test your SOAP knowledge. The third section explores SOAP fault messages. Finally, the fourth section helps you understand the HTTP Extension Framework.
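To make the envelope, header, and body arrangement concrete, here is a minimal sketch that assembles a message with Python’s standard xml.etree.ElementTree module. The envelope namespace is the one used throughout this chapter. Everything else is an assumption for illustration: the urn:example:MyObjects namespace, the TransactionID header entry, and the AddIt call with its Value1 and Value2 arguments.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
APP_NS = "urn:example:MyObjects"  # hypothetical application namespace

ET.register_namespace("SOAP-ENV", SOAP_ENV)
ET.register_namespace("m", APP_NS)


def build_envelope(transaction_id, first, second):
    """Return a SOAP message as text: XML declaration plus envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")

    # Header: information that isn't part of the data itself.
    header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}TransactionID").text = str(transaction_id)

    # Body: the data (here, a hypothetical call with two arguments).
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}AddIt")
    ET.SubElement(call, "Value1").text = str(first)
    ET.SubElement(call, "Value2").text = str(second)

    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(envelope, encoding="unicode")


message = build_envelope("TX-1001", 1, 2)
print(message)
```

A fault response would use the same envelope, with fault information in the body in place of the data.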
HTTP and the SOAP Transfer SOAP is essentially a one-way data transfer protocol. While SOAP messages often follow a request/response pattern, the messages themselves are individual entities. They aren’t linked in any way. This means that a SOAP message is standalone—it doesn’t rely on the immediate presence of a server, nor is a response expected when a request message contains all of the required information. For example, some types of data entry may not require a response since the user is inputting information and may not care about a response.
The envelope in which a SOAP message travels, however, may provide more than just a one-way transfer path. For example, when a developer encases a SOAP message within an HTTP envelope the request and response both use the same connection. The connection is created and maintained by HTTP, not by SOAP. Consequently, the connection follows the HTTP way of performing data transfer—using the same techniques as a browser to request Web pages for display.
Testing Your SOAP Knowledge Microsoft recently made two Web sites available to check your SOAP messages. The first accepts SOAP messages, parses them, and provides a check of their validity. You’ll find it at http://www.soaptoolkit.com/soapvalidator/. Figure 2.2 shows what this Web site looks like. Figure 2.2 The SOAP Message Validation site allows you to test your SOAP knowledge and learn about specific message types.
As you can see, there are three panes in this display. The SOAP Message Text window allows you to enter a message that you want to verify. You can also choose one of the valid or invalid samples from the drop down list boxes. These samples can teach you quite a bit about SOAP. They provide examples of what you can and can’t do within a message and the results that you’ll get when performing certain actions. I’ve actually learned how to distinguish some error messages by using this Web site. You don’t need to include the HTTP or XML headers in the SOAP Message Text window, just the SOAP message.
The Parsed Version window shows what the message looks like after the SOAP listener parses it. This window doesn’t tell you about the validity of the message, but it does help you understand the XML formatting better. You can use this window to determine whether the message is well formed. The tags should form pairs that are easy to see in the pane. The use of text coloring also helps you to distinguish between specific text elements.
The Message Check Results window shows the results of diagnostics the site performs on your SOAP message. You’ll see error messages in places where the SOAP message doesn’t contain required entries or the entry format is incorrect. When all of the error messages are gone, the SOAP message is ready for use. Of course, this still doesn’t mean the SOAP message will do anything. The Web site doesn’t check the validity of the data within the SOAP message. You can create a perfect SOAP message that still doesn’t work because the server-side component is expecting the data in a different format or even requires other arguments.
The second Web site is a generic SOAP listener. You can send a SOAP message to it using an application. The site will test the message much like the Message Check Results window of the SOAP Message Validation site. You’ll find this Web site at http://www.soaptoolkit.com/soapvalidator/listener.asp. Figure 2.3 shows an error output from this site. This is also an example of the SOAP fault message that we’ll discuss in the next section of the chapter. Figure 2.3 The SOAP Message Listener site validates requests from your applications.
SOAP Fault Messages Sometimes a SOAP request generates a fault message instead of the anticipated reply. The server may not have the means to answer your request, the request you generated may be incomplete, or bad communication may prevent your message from arriving in the same state that you sent it. There are many reasons that you might receive a SOAP fault message. However, you can generally categorize them into four general areas as shown in Table 2.1.
Table 2.1 General SOAP Fault Message Classifications
Message Type | Description
Client | The client generated a request that the server couldn’t process for some reason. The most common problem is that the XML format of the SOAP message is incorrect (malformed). Another common problem is the server can’t find the requested component or the client doesn’t provide all of the information the server needs to process the message. If the client receives an error message of this type, it must recreate the message and fix the problems of the original. The server usually provides amplifying information if it can determine how the client request is malformed or otherwise in error.
MustUnderstand | This error only occurs when you set the SOAP message mustUnderstand attribute to 1. The error occurs when the client sends a request that the server can’t understand or obey for some reason. The server may not understand a request when the client relies on capabilities of a version of SOAP that the server’s listener doesn’t provide. A server may not obey a client request due to security or other concerns. An upgrade of the listener or server-side components usually helps in this situation.
Server | The server couldn’t process the message even though the request is valid. A server error can occur for a number of reasons. For example, the server could run out of resources for initializing the requested component. In a complex application environment, the server may rely on the processing of another server that’s offline or otherwise inaccessible. The client should definitely resubmit the same request later, when this error occurs. The server usually provides amplifying information if it can determine the precise server-side error that occurred.
VersionMismatch | SOAP doesn’t have a versioning system. It does rely on the namespaces that you define to perform useful work. This error message occurs when the SOAP envelope namespace is either incorrect or missing. As of this writing, you should use a SOAP envelope namespace that points to “http://schemas.xmlsoap.org/soap/envelope/” as shown in the source code examples in this chapter.
It’s possible to create other fault categories for SOAP messages. The list in Table 2.1 conforms to the SOAP specification. The only requirement for SOAP fault categories is that they follow the formatting requirements for XML namespaces. A good place to look for additional information about XML namespace formatting is the Namespaces in XML document at http://www.w3.org/TR/REC-xml-names/.
When a server returns a fault message, it doesn’t return any data. Look at Figure 2.3 and you’ll see a typical client fault message. Notice the message contains only fault information. With this in mind, the client-side components you create must be prepared to parse SOAP fault messages and return the information to the calling application in such a way that the user will understand the meaning of the fault.
Figure 2.3 shows the standard presentation of a SOAP fault message. Notice that the fault envelope resides within the body of the SOAP message. A fault envelope generally contains a faultcode and faultstring element that tells you which error occurred. All of the other SOAP fault message elements are optional. Table 2.2 provides a list of these elements and tells you how they’re used.
Table 2.2 SOAP Fault Message Elements
Element | Description
faultcode | The faultcode contains the name of the error that occurred. It can use a dot syntax to define a more precise error code. The faultcode will always begin with one of the classifications listed in Table 2.1. For example, the faultcode in Figure 2.3 consists of a Client error code followed by an XMLERR subcode. This tells you that the request message is malformed because the XML formatting is incorrect. Because it’s possible to create a list of standard SOAP faultcodes, you can use them directly for processing purposes.
faultstring | This is a human readable form of the error specified by the faultcode entry. This string should follow the same format as HTTP error strings. You can learn more about HTTP error strings by reading the HTTP specification at http://www.normos.org/ietf/rfc/rfc2616.txt. A good general rule to follow is to make the faultstring entry short and easy to understand.
faultactor | This element points to the source of a fault in a SOAP transaction. It contains a Uniform Resource Identifier (URI) similar to the one used for determining the destination of the header entry. According to the specification, you must include this element if the application that generates the fault message isn’t the ultimate destination for the SOAP message.
detail | You’ll use this element to hold detailed information about a fault when available. For example, this is the element that you would use to hold server-side component return values. This element is SOAP message body specific, which means you can’t use it to detail errors that occur in other areas, such as the SOAP message header. A detail entry acts as an envelope for storing detail sub-elements. Each sub-element includes a tag containing namespace information and a string containing error message information.
You can use the presence or absence of various fault message elements to your advantage when determining the cause of an error. For example, if the faultactor element appears within the fault message, it’s a good bet that the message passed between multiple servers. In some cases, this means that you can eliminate gross formatting errors from your list of things to check because several machines parsed the message without generating an error. The absence of the detail element, on the other hand, may indicate that the error happened before the component on the server processed the request message. This means there’s a problem with processing the message, rather than a problem with the server-side component’s handling of message data.
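The checks described above are easy to automate on the client side. The sketch below, which uses Python’s standard xml.etree.ElementTree module, pulls the fault elements out of a response, splits the faultcode on its dot syntax, and turns the presence or absence of the optional elements into hints. The sample message is invented for the example (it’s modeled on the Client.XMLERR fault discussed earlier), and the function names and hint wording are assumptions, not part of any toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Invented sample, modeled on the Client.XMLERR fault described in the text.
SAMPLE_FAULT = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Client.XMLERR</faultcode>
      <faultstring>The request is not well formed</faultstring>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def parse_fault(response_xml):
    """Return a dict describing the fault, or None if the body holds no fault."""
    root = ET.fromstring(response_xml)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    code = (fault.findtext("faultcode") or "").strip()
    local_name = code.split(":", 1)[-1]             # drop a prefix such as SOAP-ENV:
    category, _, subcode = local_name.partition(".")  # Client.XMLERR -> Client, XMLERR
    return {
        "category": category,
        "subcode": subcode or None,
        "faultstring": (fault.findtext("faultstring") or "").strip(),
        "faultactor": fault.findtext("faultactor"),
        "has_detail": fault.find("detail") is not None,
    }


def troubleshooting_hints(fault):
    """Turn the optional fault elements into rough troubleshooting hints."""
    hints = []
    if fault["faultactor"]:
        hints.append("Message passed through more than one server.")
    if not fault["has_detail"]:
        hints.append("Error probably occurred before the component ran.")
    if fault["category"] == "Server":
        hints.append("Request was valid; resubmit it later.")
    return hints


info = parse_fault(SAMPLE_FAULT)
print(info["category"], info["subcode"], info["faultstring"])
for hint in troubleshooting_hints(info):
    print("-", hint)
```

The hints are only rules of thumb, as the text points out. A missing detail element suggests where to look first; it doesn’t prove where the error occurred.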
Let’s look more closely at the detail element. As stated, the detail element doesn’t stand alone—it acts as an envelope for additional information. Here’s a typical example of a detail element entry. Couldn’t Create Object 429
Notice that detail information begins with a namespace entry. In this case, a component called AddIt that appears in the MyObjects directory on the server had an error. The error message is that it couldn’t create a required object, while the associated error number is 429. The client side SOAP message parser could pass this information to the client application for additional processing. At least the user would have a better idea of what happened and could alert a network administrator to the problem. Of course, the detail element isn’t limited to a single component entry. Every component that has an error could make an entry. For that matter, you might include a single namespace entry for every error that a single component experiences. A component might actually provide several error messages that state an entire list of problems that it experienced in the processing of the SOAP message. For example, rather than return a single message at a time for faulty arguments, the component could parse all of the arguments and return a list of errors that covers all of them.
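Here is a rough sketch of how a client-side parser might collect every entry in a detail element. The XML below is not the listing from this chapter. It’s a stand-in I made up to show the idea, so the element names (AddItError, message, errorcode), the namespace URI, and the second error entry are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical detail element: element names and namespace URI are invented.
SAMPLE_DETAIL = """<detail xmlns:e="http://www.example.com/MyObjects/AddIt">
  <e:AddItError>
    <message>Couldn't Create Object</message>
    <errorcode>429</errorcode>
  </e:AddItError>
  <e:AddItError>
    <message>Second argument is not a number</message>
    <errorcode>13</errorcode>
  </e:AddItError>
</detail>"""


def parse_detail(detail_xml):
    """Return one dict per detail sub-element: namespace, message, number."""
    detail = ET.fromstring(detail_xml)
    errors = []
    for entry in detail:
        # ElementTree stores a qualified tag as "{namespace}localname".
        namespace = ""
        if entry.tag.startswith("{"):
            namespace = entry.tag[1:].partition("}")[0]
        number = entry.findtext("errorcode")
        errors.append({
            "namespace": namespace,
            "message": entry.findtext("message", default=""),
            "number": int(number) if number else None,
        })
    return errors


errors = parse_detail(SAMPLE_DETAIL)
for error in errors:
    print(error["namespace"], error["number"], error["message"])
```

Because the function returns a list, the calling application can show the user every problem the component reported in one pass, rather than one message at a time.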
HTTP Extension Framework Let’s talk for a few seconds about mandatory HTTP Extension Frameworks. The World Wide Web Consortium (W3C) defines many aspects of the HTTP header. For example, it defines what the Content-Type declaration contains. These are standardized portions of the HTTP header that every browser and browser-like application can understand. Vendors have extended the HTTP header in ways that the W3C never anticipated. For example, there are extensions for remote procedure call (RPC) mechanisms. SOAP happens to fall into this class of extensions, so it’s possible that you’ll need to define an HTTP Extension Framework for your application. An HTTP Extension Framework describes which extensions the document contains. It also determines how the recipient should handle them. You can read more about the HTTP Extension Framework in general at http://www.w3.org/Protocols/HTTP/ietf-http-ext/ and http://www.normos.org/ietf/rfc/rfc2774.txt.
SOAP uses HTTP Extension Frameworks as a way to define how it will extend a standard HTTP message. The HTTP Extension Framework includes three additional entries. First, you’ll add an M- to the beginning of the action portion of the HTTP header. For example, POST becomes M-POST. Second, you need to add the HTTP Extension Framework identification we talked about earlier, Man: “http://schemas.xmlsoap.org/soap/envelope/”; ns=NameSpace. This tells the recipient where to find information about the HTTP Extension Framework, and how to deal with it. The ns declaration tells which namespace to use with this HTTP Extension Framework. Finally, you need to add a namespace to the SOAPAction entry like this, NameSpace-SOAPAction. This qualifies the particular action you need to perform.
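The three changes are easier to see side by side with a normal request header. The following sketch builds the header lines for the mandatory form. The host, listener path, and function name are assumptions for illustration, and the value 01 is just an arbitrary placeholder for the NameSpace entry.

```python
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_mandatory_headers(host, listener_path, soap_action, content_length, prefix="01"):
    """Return the header lines for a request that uses the HTTP Extension Framework."""
    return [
        f"M-POST {listener_path} HTTP/1.1",     # 1. POST becomes M-POST
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {content_length}",
        f'Man: "{SOAP_ENV}"; ns={prefix}',      # 2. extension identification plus ns
        f'{prefix}-SOAPAction: "{soap_action}"',  # 3. SOAPAction gains the ns prefix
    ]


# Hypothetical host and listener path, used only for illustration.
headers = build_mandatory_headers(
    host="www.example.com",
    listener_path="/soap/listener.asp",
    soap_action="http://www.example.com/soap/listener.asp",
    content_length=0,
)
print("\r\n".join(headers))
```

A client that receives a 510 Not Extended status from a normal POST could rebuild its header this way and send the same SOAP message again.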
Using SOAP with Your Current Code Some developers are like kids in at a candy store. Simply because a technology is new and exciting, because it’s accepted by their peers as a good technology, all development must use that technology. SOAP isn’t the kind of technology that lends itself to this perspective. Look at SOAP as more of a value-added technology. It’s designed to overcome problems with older technologies and to simplify the programming environment in some cases. (Just how much simpler remains to be seen—vendors are currently discussing more and more added functionality, which also increases complexity.) SOAP is a good problem solving protocol, especially for distributed applications that rely on the Internet. It allows developers to solve communication and interoperability problems with DCOM and CORBA applications. The following list provides an overview of some of the applications that you may want to retrofit to use SOAP in an effort to improve reliability and reduce complexity. ■
Internet Only Applications: One situation where you may want to commit to a SOAPonly solution is when retrofitting applications that only access the Internet. The benefit of going this route is increased interoperability, reduced complexity, and fewer problems with obstacles like firewalls.
■
Applications with Partner Access: Most businesses have partners to work with today. The partners may require access to company data. Using SOAP to create the partner access portion of the application makes sense. You may, however, want to continue using DCOM or CORBA for LAN access because they provide a higher level of security and you don’t need to worry about integration.
■
Employees on the Road: Using SOAP to service employees on the road is a tossup if you already have an application in place. Since you have direct control over the employee’s machine, using DCOM or CORBA isn’t as much of an issue. In fact, you may want to use something like COM+ with Queued Components, in this case, because this solution actually provides better functionality than SOAP. However, you may find that getting through the firewall is a problem with all of these binary solutions and that SOAP is your only option.
For the purposes of this book, the term cracker will always refer to an individual that’s breaking into a system on an unauthorized basis. This includes any form of illegal activity on the system. On the other hand, a hacker will refer to someone who performs low-level system activities, including testing system security. In some cases, you need to employ the services of a good hacker to test the security measures you have in place, or suffer the consequences of a break-in. This book will use the term hacker to refer to someone who performs these legal forms of service.
■ Monitoring Programs: SOAP is a double-edged sword when it comes to monitoring programs. On one hand, monitoring programs normally deal with data in text format, so SOAP is a perfect solution. On the other hand, SOAP lacks the strong security of binary wire protocols, which means that you’re potentially opening the heart of your system to the prying eyes of crackers. You’ll need to consider the sensitivity of the data that you’re monitoring when making an application conversion choice in this situation.
■ Satellite Office Applications: Whether SOAP makes sense in this situation depends on your network setup. If each satellite office has a separate server, then you could maintain the high security provided by DCOM and CORBA locally. Use SOAP only for server updates. This solution maintains the good features of the current application implementation, while adding the flexibility of SOAP to the Internet portion of the data stream.
It’s tempting to think that you could use SOAP in place of DCOM Wire Protocol or Internet Inter-ORB Protocol (IIOP). SOAP isn’t a direct replacement for binary technologies because it lacks certain features. We’ll spend more time talking about the problems SOAP faces today in the “Current SOAP Implementation Problems” section of this chapter. These problems will likely decrease as the SOAP specification matures. However, there are certain problems that won’t go away and they affect your ability to use SOAP with your current applications.
One of the biggest problems is that SOAP relies on text data transfers. Sure, text is the reason that SOAP can work with diverse operating system platforms. Using text also means that you can’t transfer objects, graphics, or other binary formatted data very well. The world does run on the content of database management systems (DBMSs), but it also needs these binary formats. We’ll discuss a possible solution in the form of SOAP attachments later in the chapter. However, even attachments will prove an error prone solution to the problem.
Security is another problem with current SOAP implementations. A common solution today is to use Secure Socket Layer (SSL) technology to make SOAP more secure. (We’ll discuss SSL in the “Secure Socket Layer (SSL)” section of Chapter 3.) The problem with this solution is that it relies on both parties supporting SSL. In addition, you’re now looking at another binary implementation of what’s supposed to be a text-based protocol. You might find there’s a price to pay in the form of interoperability and compatibility when adding security to a SOAP solution. In some cases, you may actually find that DCOM is ahead in this regard because it provides in-depth security at only a slightly higher performance cost and with fewer interaction problems after you get it set up correctly.
SOAP also provides limited data type support.
This means that some applications you’re using today will break when you try to move them to SOAP. If your application uses complex data formats, you might actually be ahead by using one of the binary protocols or figuring out a way to reduce the complexity of the application’s data model. Theoretically, SOAP will eventually support complex data types, including objects. The reality today is that SOAP toolkits provide a bare minimum of data type support.
Discovering SOAP Services

Writing a SOAP application isn’t the end of the story. Creating a server-side listener means that you have the capability of servicing client-side requests. The client still has to find your service and ask how to use it. When using SOAP within a single organization, discovering a service is relatively easy. However, what happens when you need to service requests from third parties? Will a client be able to find your service if they don’t even know your company exists? The short answer is no.

Discovery protocols provide a means for you to advertise your service in a neutral manner. What this means is that a client can find your service, ask how to use it, and then make a request from you directly, all without any prior calls to your company. This is the perfect way to make business-to-business transactions work. An office supply company only needs to advertise its services on a central Web site to make those services accessible to another company.

Obviously, a company could advertise services in many ways. For SOAP to work as a business-to-business transaction aid, companies have to agree on a standard advertising format. At the time of this writing, there’s a lot of agreement between small groups, which means there’s more than one way to advertise your services. Considering SOAP is a new technology, the plethora of advertising techniques is expected. Eventually, one or two of these advertising methods will win out and we’ll all use the same technique for broadcasting service information. In the meantime, I’ve chosen what I feel are the best advertising methods for this book at the time of writing.
Although we’re concentrating on three discovery services in this chapter, there are many others from which to choose. For example, the Electronic Business eXtensible Markup Language (ebXML) initiative is taking off as I write this. There isn’t enough information available yet to write much about, but there probably will be shortly. Like most discovery protocols, this one is designed to promote business-to-business communication. The two interesting bits about this protocol are that it’s supported by the United Nations and it was designed from the ground up for international use. You’ll want to keep your eyes on the main ebXML Web site at http://www.ebxml.org/ if this discovery language interests you.
Eventually you need to decide which individual or set of discovery protocols to use. The following sections look at three service description protocols:
■ Discovery of Web Services (DISCO)
■ Web Services Description Language (WSDL)
■ Universal Description, Discovery, and Integration (UDDI)
These are the three best-supported and most generic protocols at the time of writing. Given the volatile nature of SOAP at the moment, however, you’ll want to keep your eyes open for other protocols that might meet your business needs better.
Understanding Discovery of Web Services (DISCO)

DISCO is a service designed to make it easier to locate and use SOAP services. This particular service is SOAP specific and a single vendor, Microsoft, currently supports it. In other words, this may not be the route to go if you need to support solutions from more than one vendor. The DISCO service relies on a special protocol named SOAP Contract Language (SCL) to allow the discovery of services by remote computers. Someone interested in locating SOAP resources will contact a DISCO server, download SCL documents of interest, and then use those documents to refine service selection. Microsoft states that a developer can use SCL to discover any resource, not just Web services, but the entire technology does seem directed at public Web services.
DISCO isn’t a complete specification, so you’ll see changes in the coming months. Microsoft will likely continue working on this specification and introduce it at some point as an enabling technology for the .NET platform. You can learn more about DISCO at http://msdn.microsoft.com/xml/general/disco.asp and http://msdn.microsoft.com/library/dotnet/cpguide/cpconenablingdiscoveryforwebservice.htm. The current document is more theory than practical information, but it does give you a good overview of what DISCO intends to accomplish.
The discovery sequence relies on an algorithm based on standard Web inquiries. A client uploads a request document to a server, which returns a discovery (.DISCO) document to the client. Depending on the content of this discovery document, the client can either request a desired resource, or make additional discovery requests. After a requestor finds a desired discovery document, they’ll read the SCL statements it contains. Essentially, these statements point to Service Description Language (SDL) files that contain a description of the target component. The application uses the information obtained to create an instance of the component using SOAP and work with it in the normal fashion.
Working with the Web Services Description Language (WSDL)

WSDL is a joint effort by several vendors including Ariba, IBM, and Microsoft. It’s an XML-based method of describing network services. The specification describes a Web service as an endpoint, the final destination for a client request. The document your application receives contains a schema for the requested component on the remote machine. An application uses the schema to make remote requests.

The vendors involved with creating WSDL designed it to provide complete flexibility and extensibility. The only implementations available right now use HTTP and SOAP to make requests. However, future WSDL implementations could use other transport and wire protocols. This Web discovery language specification also allows use of MIME, which means that you can make multiple requests in a single message.
WSDL, like many of the technologies discussed in this chapter, is a work in progress. It’s important to keep track of changes to the specification. You can learn more about WSDL at http://msdn.microsoft.com/xml/general/wsdl.asp.
When a client makes a request for information from a server, it receives a complex XML document in return. The document contains the name of the requested component, some links for finding documents associated with the component, and a full description of the component’s schema. The document also contains other information, but these are the important pieces for this discussion.

The document links are actually namespaces. These namespaces define various document elements, just as the namespaces in a regular SOAP document do. Most of these namespaces point to locations on the server. Of course, there are some of the usual suspects, like xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/", which defines the location of the SOAP schema. The schema sections include elements for the various method names, expected parameters, and the parameter types. This means that you know exactly how to access the component at the outset, rather than experiencing some level of trial and error in discovering the component.

The pros of this discovery technology are that it provides detailed information, great flexibility, and multiple vendor support. Because of these features, WSDL is the best technology to use for private Web service discovery. You’d use this technology with business partners and others who need access to your network, but in a situation in which you wanted to exercise some control over the people who gain access to the information. Some of the cons include a high level of complexity and the lack of a framework right now. Hopefully, the vendors involved with this effort will correct some of these failings in the near future.
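To make the idea of reading method names, expected parameters, and parameter types out of a service description more concrete, here is a minimal Python sketch. It is an illustration only: the Calculator component, the Add method, the urn:example-calculator namespace, and the XML Schema namespace URI are all assumptions made for the example, and the fragment is far smaller than the document a real toolkit would return.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, hand-written description of an "Add" method.
# None of these names come from a real service.
SAMPLE_WSDL = """\
<definitions name="Calculator"
    targetNamespace="urn:example-calculator"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example-calculator"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="AddRequest">
    <part name="First" type="xsd:int"/>
    <part name="Second" type="xsd:int"/>
  </message>
  <message name="AddResponse">
    <part name="Result" type="xsd:int"/>
  </message>
  <portType name="CalculatorPortType">
    <operation name="Add">
      <input message="tns:AddRequest"/>
      <output message="tns:AddResponse"/>
    </operation>
  </portType>
</definitions>
"""


def describe_operations(wsdl_text):
    """Map each operation name to its input and output (name, type) parts."""
    root = ET.fromstring(wsdl_text)

    # Index every <message> by name so operations can look up their parts.
    messages = {
        msg.get("name"): [(part.get("name"), part.get("type"))
                          for part in msg.findall("wsdl:part", WSDL_NS)]
        for msg in root.findall("wsdl:message", WSDL_NS)
    }

    def parts_for(operation, direction):
        node = operation.find(f"wsdl:{direction}", WSDL_NS)
        if node is None:
            return []
        # Drop the namespace prefix ("tns:AddRequest" -> "AddRequest").
        return messages.get(node.get("message").split(":")[-1], [])

    return {
        op.get("name"): {"input": parts_for(op, "input"),
                         "output": parts_for(op, "output")}
        for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    }


operations = describe_operations(SAMPLE_WSDL)
for name, signature in operations.items():
    print(name, signature)
```

With a description like this in hand, a client knows the method name and the type of every parameter before it sends its first request, which is the point the discussion above makes about avoiding trial and error.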
Understanding Universal Description, Discovery, and Integration (UDDI)

UDDI promises to become the discovery language used by everyone involved in building distributed applications, or at least it’ll grab the majority share of users. The reason is simple: UDDI abstracts the process of discovering the Web services that another company offers. It sits on top of the protocols that are already in place and builds on them. In short, it doesn’t reinvent the wheel the way many Web technologies seem to do today. In addition, vendors designed this technology for public use so that companies no longer need to know anything about each other before they discover the services that they offer.

The element that separates UDDI from other Web service discovery technologies is the registry, which UDDI participants rely on. UDDI replicates the registry information across all participating servers so companies in diverse locations can find each other. Instead of having to know which company-specific server to contact for information, UDDI relies on a central repository of information. It uses a single place to hold information about the Web servers represented by all companies that use the registry. Someone who wants to know about a service you provide doesn’t even need to know that your company exists until they look for the required service. The registration is an XML file that contains these three categories of information:
■ Contact information, including company name and address.
■ Taxonomic classifications of the business, which include a listing of the industries that the company can assist.
■ Technical listing of the Web services that a company offers (or at least those that it wants to make public).
The vendors who designed UDDI built it on the work done for WSDL. As a result, UDDI looks surprisingly similar to WSDL at a low level. Of course, UDDI also contains elements of both XML and SOAP and it relies on HTTP as a standard transport protocol.
There are many great places to find information about UDDI. One of the better places to find general information is The XML Cover Pages (http://www.oasis-open.org/cover/uddi.html). You can find a UDDI technology overview at http://www.oasis-open.org/cover/UDDI_Technical_White_Paper.pdf. The general UDDI Web page is at http://www.uddi.org/. It provides links for the actual UDDI specification, among other things. Microsoft’s UDDI registry announcement appears at http://www.microsoft.com/presspass/press/2000/Nov00/UDDIBetaPR.asp. Even though their registry is essentially for testing purposes now, you can still see how the technology will eventually work. Finally, the Microsoft UDDI SDK download site is at http://msdn.microsoft.com/library/techart/Progguide.htm.
UDDI isn’t a full-fledged discovery service. It’s designed to allow businesses to find each other and learn about services they offer. However, UDDI stops at this point. If you want to learn the details about a particular service, you need to contact the business in question through its server. UDDI also avoids providing business details such as pricing information, time required to provide a service, and the geographic service limits for a target company. The vendors working on UDDI are designing it to fulfill an extended role. For the purposes of this book, UDDI is a technology you can use to learn about Web services provided by other companies. However, it might help to keep UDDI’s bigger role in mind. Vendors eventually want to use it to track products and general services as well, making UDDI a type of electronic yellow pages for the entire world.
Putting Everything Together

At this point, you have some idea of how SOAP works, but may not really know how everything fits together. SOAP follows a logical data flow that includes an interaction of some type between a client and a server. It relies on a text file formatted using the XML specification and includes special entries that describe the content of the SOAP message. Here are the steps in a typical exchange between client and server.

1. The client instantiates a local SOAP object. It describes the location where the SOAP message should go, serializes the content that it wants to send, and includes any special attributes.
2. The local object encapsulates the SOAP message within a request HTTP message. It then sends this message to the server.
3. The listener on the server receives the message sent by the client.
4. A parsing object decodes the SOAP message. If the message is valid, it instantiates a copy of the requested object and feeds it the data sent by the client. Otherwise, the parser sends an error message back to the client.
5. At this point, if the server has everything it needs and the client doesn’t require any feedback (or is incapable of receiving it), the parser frees the object it created and waits for another request.
6. If the client requires feedback, the server generates a SOAP message of its own. This is the normal course of action when using the HTTP transport, even if the server doesn’t really have anything to say.
7. The Web server encases the SOAP message within a reply HTTP message. It sends this message back to the client.
8. The client checks the validity of the HTTP message first. If the Web server reports an error, then the client might have to send the request message again using different parameters. For example, the server might require the client to use an HTTP Extension Framework as described earlier in the chapter.
9. A client-side parser checks the content of the SOAP message. It removes any result values and sends them to the client. Since the client made the request as an object call, the parser uses standard object value return techniques.
10. The client reacts to the data returned from the server. In some cases, this may simply mean doing nothing at all for a successful transfer. In other cases, the client may display data or perform other tasks.

SOAP isn’t some type of mystery protocol; the end result is much the same as binary protocols such as DCOM. The learning curve for using SOAP isn’t a matter of understanding how to perform a task in a new way, but simply using new techniques to perform that task.
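The ten steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the behavior of any particular toolkit: the Add method, the urn:example-calculator namespace, the host name, and the listener path are all hypothetical, nothing is sent over a network, and a real listener would return a SOAP fault rather than raise an exception when a request is invalid.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
METHOD_NS = "urn:example-calculator"  # hypothetical method namespace
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_envelope(method, params):
    """Step 1 (and step 6 on the server): serialize a call or a reply."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{METHOD_NS}}}{method}")
    for name, value in params:
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def wrap_in_http(envelope_xml, host, path, soap_action):
    """Step 2: encapsulate the SOAP message within an HTTP request."""
    payload = envelope_xml.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {len(payload)}",
        f'SOAPAction: "{soap_action}"',
    ]
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + payload


def handle_request(envelope_xml):
    """Steps 4 through 6: decode the call, run it, and build the reply."""
    call = ET.fromstring(envelope_xml).find(f"{{{SOAP_ENV}}}Body")[0]
    method = call.tag.split("}")[-1]
    if method != "Add":
        # A real listener would send a SOAP fault back instead.
        raise ValueError(f"unknown method: {method}")
    args = {child.tag: int(child.text) for child in call}
    return build_envelope("AddResponse",
                          [("Result", args["First"] + args["Second"])])


def read_result(envelope_xml):
    """Step 9: pull the result value out of the reply."""
    reply = ET.fromstring(envelope_xml).find(f"{{{SOAP_ENV}}}Body")[0]
    return int(reply.find("Result").text)


request = build_envelope("Add", [("First", 1), ("Second", 2)])
http_request = wrap_in_http(request, "www.example.com", "/soap/listener",
                            "urn:example-calculator#Add")
print(read_result(handle_request(request)))
```

The client code ends up with an ordinary integer, which mirrors the observation that the result is much the same as a binary protocol call even though the message on the wire is text.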
You can divide a problem into these five main areas:
■ Modify client applications so they call a SOAP component, rather than create a local object. This is similar to the task you’d perform for a DCOM or a CORBA application; only the actual calling syntax differs.
■ Create a client-side component that generates and interprets SOAP messages based on the input received by the client or server. In this case, SOAP is slightly more complex than older technologies like DCOM because vendors haven’t introduced all of the required tools. However, after new technologies like Microsoft’s .NET Framework appear, the task of creating the client-side component will become easier.
■ Install or create a server-side listener. In many cases, you can use the default listener that comes with a vendor-supplied SOAP toolkit.
■ Create a server-side component to parse the incoming SOAP message. This component is also responsible for interacting with the data component, the one that will produce an output. In some cases, this component might also need to perform data translation. For example, a database might support data types that SOAP doesn’t support directly.
■ Check server-side business logic and data access components to ensure they’ll interact with SOAP in a safe manner. In some cases, this means setting new policies. You may find that security concerns affect how you transfer data to the client. In other cases, you’ll need to change component logic so that the component doesn’t rely on client responses. The Internet is an unreliable medium, so your application needs to allow for connectivity problems.
You’ll get a better idea of how these five tasks come into play as the book progresses. The important thing to consider now is that SOAP doesn’t have to present major challenges to your organization or make big changes in the way you perform tasks. It will make certain tasks a lot easier to perform and you’ll find that you have flexibility you didn’t have in the past. The focal point isn’t in the way you’ll perform these tasks, but in the technique you use to perform them.
Using SOAP to Move Data

The whole reason to use any protocol is to move data. The applications you create move data requests from a client to a server, and then the reply from the server to the client. A server might have to call on the services of another server to answer client requests. Two servers might exchange data in an effort to optimize operations or perform tasks like user validation. In the end, it all amounts to moving data from one place to another, shuffling it around a bit to produce a result, and storing it in yet another place.

Therefore, a true measure of a protocol is how well it moves data around. Not only do you need to consider the speed at which the protocol moves the data, but also the accuracy, efficiency, and reliability with which the protocol moves the data. After all, no one cares how fast you can move data if you can’t move the data accurately (without any errors). A developer is also concerned about the ease of moving the data, as well as the flexibility of the data protocol.

So, how does SOAP stack up? You need to ask this question before committing resources to a solution that won’t work. The first section that follows looks at some of the current implementation problems with SOAP. These problems won’t stay around forever because SOAP is an evolving standard. In addition, even though the problems exist today, vendors are already talking about solutions they plan to implement in the future. Consider this first section a snapshot of SOAP today. Make sure you spend some time looking at tools that become available by the time you read this.

The computer industry is also working on other solutions that mimic the capabilities of SOAP. The two most popular solutions are XML-RPC and XML Protocol (XMLP). XML-RPC actually appeared on the scene before SOAP and is currently more mature. It allows you to do some things that SOAP won’t currently allow, but that flexibility has some costs that you need to know about. XMLP is a new protocol. At the time of this writing, XMLP is barely out of the proposal stage. We’ll talk about both of these alternatives so that you have the full story before embarking on your next development project.
In the confusing world of computer acronyms, the W3C committee in charge of XML Protocol recently changed the acronym from XP to XMLP. Whether this is in deference to Microsoft Windows XP or not isn’t clear. However, for those of you who were already familiar with the XP acronym, be sure to look for XML Protocol under the XMLP acronym from now on.
Current SOAP Implementation Problems

SOAP is a new technology and the specification is still under review as I write this. Microsoft and other vendors are creating new toolkits that permit developers to create SOAP applications with a little more ease than writing everything by hand. Of course, with the SOAP specification and associated toolkits in a state of flux as they are now, it’s only reasonable to assume that you’ll run into some problems with the various vendor offerings.

Some SOAP problems result from an immature specification. One issue that vendors are working on is the idea of a unique identifier for SOAP. For example, when you work with DCOM, you can access a component using a globally unique identifier (GUID). This means that if two components have the same name, you can still identify them using their respective GUIDs.

Many developers complain that SOAP lacks resource management tools. For example, SOAP doesn’t include the concept of connection pooling. The lack of resource management means that SOAP has to create resources from scratch for every user request, making SOAP inefficient when compared to other new technologies such as COM+.

SOAP toolkits also have different ways of performing the same task. Let’s look at a concrete example of the problems with the current implementations. I used two toolkits and fed them the same SOAP request. This first listing is the output of the Microsoft SOAP Toolkit Version 2.

1 2 3
Notice that this output is relatively uncluttered and looks very much like the simple example we talked about earlier. While this output works with Microsoft products, it doesn’t necessarily provide full compatibility with the SOAP specification. We’ll talk about how this sample misses the specification and what Microsoft plans to do about it as the section progresses. The second listing is from 4S4C (Simon’s Soap Server Services for COM).

3 1 2
While 4S4C produces complex looking output, it also provides more information than the Microsoft SOAP Toolkit. In many cases, the additional output can mean the difference between an application that works and one that doesn’t. Of course, the 4S4C toolkit has problems of its own. The biggest problem is that the XML tag doesn’t contain the encoding attribute, which confuses the parsing mechanisms for some languages. Since this problem is relatively minor and only appears in one place at the beginning of the message, you could theoretically hand code around it for now. Hopefully, the 4S4C author will fix this problem with the next release of the product.

You need to consider several SOAP specific issues when looking at the output of these two toolkits. According to the SOAP specification, the placement of the result in the Microsoft message is incorrect: it should appear as the first element in the message, not the last as shown. The current ordering is more like the output of COM than the output of SOAP. It also means that SOAP message decoding is different for the Microsoft SOAP Toolkit output than for 4S4C. This ordering issue can cause problems with clients who are expecting to receive elements in a specific order. Microsoft says they’ll fix this problem and they may have done so by the time you read this. Still, element ordering is an important issue to consider.
4S4C is just one of many SOAP toolkits on the market that allow you to create SOAP applications using Visual C++. We’ll cover more of these toolkits as the book progresses. Look at http://www.pocketsoap.com/4s4c/ if you want to learn more about 4S4C. 4S4C is an easy-to-use toolkit that includes many features that the Microsoft offering doesn’t. For example, this toolkit offers complex variable handling and custom interface support. It also provides superior support for Visual C++ users, at least as of this writing.
Another problem with the Microsoft SOAP Toolkit may not go away very soon. Notice that the Microsoft Toolkit doesn’t provide proper qualification attributes showing the type of the input values (First and Second) or the return type. This particular problem causes the output of this toolkit to fail when used with an Apache server. In this case, Apache promises to fix their server offering since the SOAP specification doesn’t require type attributes for variables. However, using type attributes is still a good idea since it avoids ambiguity problems and makes it easier to decode the SOAP message. The price, of course, is increased SOAP message size, which may affect performance.

The Microsoft SOAP Toolkit also uses “Result” as a return value name, no matter what the component’s author may have requested. While the SOAP specification isn’t very clear about this problem, the common way of handling a return value without a name is to use “defaultname” as shown in the 4S4C example. This particular problem only exacerbates the problem of element ordering since it’s impossible to specify an element by name if there isn’t a common naming method. Unfortunately, the SOAP committee will need to work out this problem before vendors will know how to fix the problem in their toolkits.

The envelope tag in the Microsoft SOAP Toolkit example is missing two important attributes: the XML Schema Definition (XSD) location and the XML Schema for Instances (XSI) location. Parsers need both of these attributes if you want them to define variable types throughout the SOAP document. While the SOAP specification doesn’t say that a SOAP implementation must provide these attributes, it does show them throughout the entire document including the example code. Again, Microsoft has promised to fix this problem in a future build of its toolkit. The important consideration is that you’ll want to check for these entries if you’re having a problem with data typing within a SOAP document.
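The ordering and naming issues are easier to see in code. The following Python sketch uses two hand-written response bodies that are assumptions built only from the element names mentioned in this section (First, Second, Result, and defaultname); they are not the toolkit listings themselves, and the AddResponse wrapper name is hypothetical. One message places the return value last, the other places it first.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def make_response(children):
    """Wrap hypothetical child elements in a minimal SOAP response."""
    return (
        f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}"><SOAP-ENV:Body>'
        f"<AddResponse>{children}</AddResponse>"
        "</SOAP-ENV:Body></SOAP-ENV:Envelope>"
    )


# Illustrative messages only; they are not real toolkit output.
RESULT_LAST = make_response(
    "<First>1</First><Second>2</Second><Result>3</Result>")
RESULT_FIRST = make_response(
    "<defaultname>3</defaultname><First>1</First><Second>2</Second>")


def response_element(xml_text):
    """Return the first element inside the SOAP body."""
    return ET.fromstring(xml_text).find(f"{{{SOAP_ENV}}}Body")[0]


def value_by_position(xml_text):
    """Treat the first child element as the return value."""
    return response_element(xml_text)[0].text


def value_by_name(xml_text, names=("Result", "defaultname")):
    """Look for the return value under each name the client knows about."""
    reply = response_element(xml_text)
    for name in names:
        node = reply.find(name)
        if node is not None:
            return node.text
    raise LookupError("no recognized return element")


print(value_by_position(RESULT_FIRST), value_by_position(RESULT_LAST))
print(value_by_name(RESULT_FIRST), value_by_name(RESULT_LAST))
```

A client that decodes by position picks up an input value instead of the result when the return value comes last, and a client that decodes by name only works because it was told both names in advance. Either way, the client has to know something about the toolkit that produced the message.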
Like everything else with SOAP, you need to define the location of the type attribute specifications or schemas. You typically define two namespaces to make this work. The first, xmlns:xsd="http://www.w3.org/1999/XMLSchema", is the XML Schema Definition. The second, xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance", is the XML Schema for Instances. You can find a complete write-up of both entries at http://www.w3.org/TR/xmlschema-0/. This document also tells you which data types XML supports natively. Note that your SOAP implementation may not support all of these data types right now, but probably will in the future. You’ll also want to learn about the other two parts of the XML schema: structures at http://www.w3.org/TR/xmlschema-1/ and data types at http://www.w3.org/TR/xmlschema-2/.
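As a rough sketch of how the two namespaces and a type attribute work together, the Python fragment below writes typed parameters and then decodes them again. The Add element and the parameter names are hypothetical, and the decoder table is a deliberate simplification: it matches the literal xsd: prefix rather than resolving the prefix to its namespace, which a real parser would do.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/1999/XMLSchema"
XSI = "http://www.w3.org/1999/XMLSchema-instance"
ET.register_namespace("xsi", XSI)

# Simplified lookup keyed on the literal "xsd:" prefix used below.
DECODERS = {"xsd:int": int, "xsd:float": float, "xsd:string": str}


def add_typed(parent, name, value, xsd_type):
    """Append a parameter element that carries an xsi:type attribute."""
    child = ET.SubElement(parent, name)
    child.set(f"{{{XSI}}}type", xsd_type)
    child.text = str(value)
    return child


def decode(element):
    """Convert the element text using its xsi:type, or leave it as text."""
    converter = DECODERS.get(element.get(f"{{{XSI}}}type"), str)
    return converter(element.text)


call = ET.Element("Add")        # hypothetical method element
call.set("xmlns:xsd", XSD)      # declare the prefix used in the type values
add_typed(call, "First", 1, "xsd:int")
add_typed(call, "Second", 2.5, "xsd:float")
ET.SubElement(call, "Label").text = "7"   # untyped on purpose

xml_text = ET.tostring(call, encoding="unicode")
decoded = {child.tag: decode(child) for child in ET.fromstring(xml_text)}
print(xml_text)
print(decoded)
```

The untyped Label value comes back as the text "7" because the receiver has no way to tell whether it was meant as a number, which is the ambiguity that type attributes remove at the cost of a larger message.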
By this time, you may be concerned that you’re going to run into compatibility problems when working with SOAP. It’s important to remember that SOAP is still in its infancy. The SOAP committee will eventually develop a robust specification. Vendors will eventually implement that specification in a way that allows SOAP projects to interact. Of course, there’s always the specter of vendor bragging rights and marketing with which to contend. Vendors continue adding extensions to standardized products in an effort to differentiate their product from someone else’s. The main purpose of this section of the chapter isn’t to present you with a hit list of problems in SOAP implementations today—it’s to point out problems that you should look for in the future. Creating a well-formed SOAP document might seem easy, but, as shown here, is sometimes harder than it looks.
SOAP and XML-RPC

XML-RPC was one of the first text-based remote procedure call (RPC) technologies on the market designed for Internet use. If you spend some time looking at this protocol, you’ll notice that it bears some resemblance to SOAP, but the resemblance isn’t strong. XML-RPC relies on HTML-type organization within an XML framework.
There are several places on the Web to find out more about XML-RPC. The XML-RPC home page is at http://www.xml-rpc.com/. You can also find a copy of the current specification at http://www.xmlrpc.com/stories/storyReader$7. A good overview and simple explanation of XML-RPC appears at http://davenet.userland.com/1998/07/14/xmlRpcForNewbies/. You’ll also find an interesting XML technology comparison chart at http://www.w3.org/2000/03/29-XMLprotocol-matrix.
One of the big differences between XML-RPC and SOAP is the lack of namespaces within XML-RPC. Everything needed to define a component and its interface appears within the document. This organization is simpler than SOAP at the outset, but the lack of flexibility can create problems. If XML-RPC doesn’t know how to handle a new variable type, you can’t add support for that variable type. SOAP is far more extensible than XML-RPC.

Interestingly enough, XML-RPC does provide support for complex data types like structures and arrays. It’s also interesting to note that XML-RPC supports an encoded binary data type. If you want to get binary data type support with SOAP, you have to use an attachment.

XML-RPC relies on HTTP for transport purposes like SOAP. It also uses a similar, though simplistic, fault mechanism. Again, the extensibility of SOAP provides you with more error reporting options.

You can convert XML-RPC messages to SOAP messages. Since both are text messages, there aren’t any binary compatibility problems to consider. I’m not going to provide you with a blow-by-blow description of the process in this chapter, but you can find a write-up on the topic at http://lwprotocols.org/xmlrpc2soap.html.
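For a quick look at the points above, Python’s standard xmlrpc.client module can serialize an XML-RPC call without contacting a server. This is only an illustration: the Add and StoreImage method names and the record contents are made up for the example.

```python
import xmlrpc.client

# A simple call: the message carries no namespace declarations at all.
call_xml = xmlrpc.client.dumps((1, 2), methodname="Add")
params, method = xmlrpc.client.loads(call_xml)

# Structures, arrays, and base64-encoded binary data are built in.
record = {
    "name": "logo",
    "sizes": [16, 32],
    "data": xmlrpc.client.Binary(b"\x00\x01\x02"),
}
store_xml = xmlrpc.client.dumps((record,), methodname="StoreImage")
(decoded,), _ = xmlrpc.client.loads(store_xml, use_builtin_types=True)

print(call_xml)
print(store_xml)
print(decoded)
```

The binary value travels inside the message as a base64 element, so no attachment mechanism is needed. On the other hand, the set of element types is fixed, which is the flexibility trade-off described above.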
SOAP and XML Protocol (XMLP)

This is the new kid on the block. At the time of writing, the vendors supporting this protocol had barely begun their work. I wanted to include this protocol because it looks like it will become an important tool for most developer toolboxes. Of course, you’ll want to wait until the XMLP protocol becomes a standard before you spend much time developing for it.
You can find out more about XMLP at http://www.w3.org/2000/xp/. Another interesting place to find general information about XMLP is http://www.oasis-open.org/cover/xp-prot.html.
The element of XMLP that I find interesting is that it’s a peer protocol technology. Peer-to-peer communication has gained interest again because of the way companies like Napster are using it. XMLP is also one of the few RPCs to provide more than one transport protocol. You can use XMLP on both HTTP and SMTP. Even one additional protocol tends to make a protocol more flexible. Of course, most people will probably use XMLP on HTTP for now.

You may not use XMLP for your next project; it’s still a new protocol in the midst of creation. However, XMLP shows a great deal of promise and you should keep your eye on it for the future. The model on which XMLP is based should provide maximum flexibility and extensibility (two catch phrases for any new technology today).
Understanding SOAP Attachments

Some people tend to forget that HTTP is a transport mechanism only; HTTP doesn’t define the content of the message that it carries. That’s why we can use HTTP to transport SOAP messages from one place to another. Likewise, SOAP, which vendors have based on XML, only defines one potential type of message within the HTTP envelope. Just as bill collectors commonly place advertising in your billing statement, vendors can place other materials within the HTTP envelope that carries a SOAP message. In fact, that’s precisely what this section of the chapter talks about.

The secret to SOAP attachments is using the Multipurpose Internet Mail Extensions (MIME). This is the same magic used to send attachments in mail. Anyone who has browsed their e-mail in source code mode knows how a MIME message is put together.
Unlike many of the extensions to a specification that you’ll run into, there’s a specification for SOAP attachments. You can read about it at http://msdn.microsoft.com/xml/general/soapattachspec.asp. At the time of writing, this particular part of SOAP isn’t part of a public specification process. However, given the way Microsoft and other vendors are handling SOAP, there’s little reason to doubt that SOAP attachments will become a public specification and approved by a standards group. In addition, this technique relies on existing technology, so there’s little reason to think that a standards group won’t approve the specification.
Let’s dissect a MIME message for the sake of discussion. A mime message consists of at least three entries. 1. A MIME header 2. A MIME part description 3. Content The three pieces will always begin in this order. Of course, the reason to use MIME is that you can create a message with multiple parts. Parts 2 and 3 of the message can repeat
Ch
2
58
Chapter 2
SOAP In Theory
several times. Part 1 only appears once at the beginning of the message. Here’s an example of Part 1 of the message.
    MIME-Version: 1.0
    Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start=""
    Content-Description: This is a sample message header.
The Content-Type entry in the source code above normally appears on one line. I’ve shown it on two lines to accommodate the formatting requirements of the book.
Note that this header contains the MIME version number, the type of content, a starting location, and a description of the content. The starting point is especially important because it’s the first thing the recipient will see. The “start=” attribute always points to the SOAP message. Here’s an example of Part 2 of the message.
    --MIME_boundary
    Content-Type: text/xml; charset=UTF-8
    Content-Transfer-Encoding: 8bit
    Content-ID:
As you can see, this part begins with the boundary marker (--MIME_boundary) to show that it’s another part of the message. There are also content type, charset, and encoding entries, much like the HTTP header we looked at earlier in the chapter. The Content-ID entry defines the name of this part of the message. The first part of the Content-ID entry defines the name of the file, while the second part contains the domain that generated the content.
The third part of a MIME message sequence is the data. We’ve already looked at SOAP messages several times in the chapter, so I won’t repeat that information here. There isn’t any difference in the SOAP format for a MIME message. It’s the same as a standalone message. Of course, you can just as easily include any binary file as part of the MIME message.
Most people will associate MIME with e-mail messages. However, MIME, like SOAP, can rely on any of several transport protocols including HTTP. The Microsoft documentation shows you how to bind MIME with HTTP. In this case, you replace Part 1 of the MIME message sequence with a special type of HTTP header. The header looks similar to the other HTTP headers discussed in the chapter so far. The main difference is that the Content-Type entry looks like the one used for a MIME message, rather than the typical HTTP header entry.
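The three-part layout described above can be sketched with Python’s standard email package. This is only an illustration under stated assumptions, not the method the attachment specification prescribes: the Content-ID values, the example.com domain, and the function names are hypothetical, and the SOAP payload is whatever envelope text you pass in.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical Content-ID for the SOAP part: a file name plus an originating domain.
SOAP_CONTENT_ID = "<order-request.xml@example.com>"


def build_soap_with_attachment(soap_xml: str, attachment: bytes) -> MIMEMultipart:
    """Build a Multipart/Related message: header, SOAP part, binary part."""
    # Part 1: the MIME header. "start" points at the SOAP part's Content-ID.
    message = MIMEMultipart(
        "related",
        boundary="MIME_boundary",
        type="text/xml",
        start=SOAP_CONTENT_ID,
    )
    message["Content-Description"] = "This is a sample message header."

    # Parts 2 and 3: a part description followed by the SOAP content.
    soap_part = MIMEText(soap_xml, "xml", "utf-8")
    soap_part["Content-ID"] = SOAP_CONTENT_ID
    message.attach(soap_part)

    # Parts 2 and 3 repeat for each attachment (here, arbitrary binary data).
    binary_part = MIMEApplication(attachment)
    binary_part["Content-ID"] = "<invoice.bin@example.com>"  # hypothetical name
    message.attach(binary_part)
    return message


def list_content_ids(raw_message: str) -> list:
    """Parse a raw MIME message and return the Content-ID of each part."""
    parsed = message_from_string(raw_message)
    return [part["Content-ID"] for part in parsed.get_payload()]
```

Calling build_soap_with_attachment(...).as_string() produces text with one header block followed by two sections that each start with --MIME_boundary. The library adds its own transfer encoding for each part, so the output will not match the listings above byte for byte.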
Case Study
Sometimes a SOAP solution takes the form of cooperative business. Take, for example, one company that we’ll call MyBuy Corporation that decided to set up a merge-in-transit system. A merge-in-transit system is where two companies cooperate to fulfill customer orders in the most efficient manner possible. One company produces goods, while another company assembles goods from various factory outlets for shipment to customers. You’ll see how this works as the case study progresses.
In this case, MyBuy and MyMail (a fictitious name for a company like FedEx or UPS) decided to team up. MyBuy receives orders from customers for a variety of their products. Since MyBuy is nationwide, they need warehouses in several states to provide timely delivery of their products. The problem is that these warehouses are costly to operate and maintain. Not only that, but MyBuy also pays MyMail to pick up orders at the closest warehouse for delivery to customers, which is another expense. Figure 2.4 shows the old company setup, and you can see how inefficient it is.
Figure 2.4 The original MyMail setup reflects what most companies use today, which is inefficient and costly.
[Figure 2.4 labels: MyBuy Factories, MyBuy Factory Warehouses, MyBuy Assembly Warehouse, MyBuy Local Warehouses, Customers]
Someone at MyBuy decided that the company could eliminate 3% to 5% of its business expenditures by getting rid of all of those local warehouses. Using a factory direct approach would mean that MyBuy wouldn’t need to keep track of inventory in each warehouse in hope of fulfilling orders from existing stock.
This is where MyMail comes into play. MyMail has to store MyBuy’s products in their warehouses anyway before shipment. In addition, MyMail has to maintain local warehouses to provide service to all of their customers. When MyBuy approached MyMail about their problem, the two companies decided to work together to solve it. MyBuy eliminated their local warehouses and began to store their products at the extensive MyMail local warehouses. MyBuy now simply sends products from their factories to the MyMail assembly warehouses where
MyMail employees put orders together and ship them to customers. Figure 2.5 shows what this arrangement looks like.
Figure 2.5 The new MyMail setup saves money by reducing the number of distribution points.
[Figure 2.5 labels: MyBuy Factories, MyMail Assembly Warehouse, SOAP Orders, MyMail Local Warehouses, Customers]
The only problem with this arrangement, of course, is that the two companies need some way of passing order information to each other. That’s where SOAP comes into play. When MyBuy receives a customer order, an employee keys the information into a local database. The order information travels from the MyBuy company site to the MyMail Web site using the same business-to-business transaction Web site that other MyMail customers use. The transaction wouldn’t be possible without SOAP because the two companies use different computer setups and their databases don’t accept data in the same way. A translation has to take place, one that isn’t easy to perform using older technologies like DCOM. (Even if MyBuy and MyMail could agree on a single setup, MyMail has other customers to support.)
MyMail actually gets the order shipped a day earlier than normal because the MyBuy warehouse doesn’t have to assemble the order and then call MyMail to pick it up. As a result, the customer normally receives their order earlier. Not only is MyBuy more efficient, but customer satisfaction increases as well.
However, there’s an added bonus in this setup. When a customer uses the MyBuy Web site, the order goes directly from MyBuy to MyMail without any intervention. This means that an order made by a customer at 1 a.m. could theoretically arrive the same day. The MyMail warehouse that’s closest to that customer will receive the order and could fulfill it if all of the needed parts from the MyBuy factories are on hand.
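To make the translation step in this case study a little more concrete, here is a minimal sketch of how an order record might be wrapped in a SOAP 1.1 envelope before it travels from MyBuy to MyMail. Everything specific to the order is an assumption made for illustration: the urn:example:mybuy-orders namespace, the SubmitOrder, OrderId, CustomerZip, and Item names, and the shape of the order dictionary are all hypothetical. Only the envelope namespace comes from SOAP 1.1 itself.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ORDER_NS = "urn:example:mybuy-orders"  # hypothetical application namespace


def build_order_envelope(order: dict) -> str:
    """Serialize a hypothetical order record as a SOAP 1.1 envelope string."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("m", ORDER_NS)

    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{ORDER_NS}}}SubmitOrder")

    # Plain text fields are all the receiving side needs to parse the order.
    ET.SubElement(request, f"{{{ORDER_NS}}}OrderId").text = str(order["order_id"])
    ET.SubElement(request, f"{{{ORDER_NS}}}CustomerZip").text = order["customer_zip"]

    items = ET.SubElement(request, f"{{{ORDER_NS}}}Items")
    for sku, quantity in order["items"]:
        item = ET.SubElement(items, f"{{{ORDER_NS}}}Item")
        item.set("sku", sku)
        item.set("quantity", str(quantity))

    return ET.tostring(envelope, encoding="unicode")
```

Because the result is plain XML text, the receiving company can parse it with whatever tools its own platform offers and map the fields into its own database format.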
CHAPTER 3
An Overview of Security Issues for SOAP
In this chapter
Introduction
Understanding SOAP Privacy and Security Issues
Security Standards You Should Know About
User Identification Issues
Where Do You Go from Here?
Case Study
Introduction
Security figures prominently in the headlines today. It seems that security is always an issue for someone. The story is always the same. A cracker breaks in, causes some problems, an administrator in the target company catches them, and the company promises to do something about security in the future. Of course, fixing the problem after someone breaks into your system is like putting your seat belt on after an accident. Yes, you can prevent future damage, but your system has already suffered loss.
For the purposes of this book, the term cracker always refers to an individual who breaks into a system without authorization. This includes any form of illegal activity on the system. On the other hand, a hacker refers to someone who performs low-level system activities, including testing system security. In some cases, you need to employ the services of a good hacker to test the security measures you have in place, or suffer the consequences of a break-in. This book uses the term hacker to refer to someone who performs these legal forms of service.
This chapter is going to provide a basic overview of common security considerations for SOAP applications. We’ll talk about security quite a bit as the book progresses and we get into the example programs. Consider this chapter as the introduction to more material. It talks about common elements, places to get additional help, and some of the tasks you need to perform for each application.
Every developer should remember two things as part of the development process. First, security is an ongoing process. As soon as you fix one set of problems, crackers will find yet another way to break into your system. Therefore, it’s important to make your applications flexible. Make sure you can add new security features as needed. Second, no matter how well you designed your application, someone will figure out a way to break the security measures you included with it. With this in mind, you should consider adding a level of security monitoring with the application. We’ll talk about this particular issue as the book progresses because security monitoring is a customized part of the programming process.
The first section of this chapter, “Understanding SOAP Privacy and Security Issues,” discusses how working with a public network is different from working with applications on the LAN. One important issue is privacy. Not only do you need to keep company data safe from prying eyes, but you also need to protect customer identities and other information that usually aren’t exposed on a LAN. Given the text-based nature of SOAP transfers, this issue becomes even more important.
Microsoft and other vendors realize that security is an increasing concern because of the number of cracker attacks hitting both their Web sites and those of their customers. Some vendors now talk about levels of security service and the need to differentiate these levels by vendor. For example, Microsoft knows that its current security policies
leave something to be desired in a world of distributed applications, so they’re implementing new security strategies to overcome some of these limitations. You can read about Microsoft’s new third-wave security strategy at http://www.microsoft.com/technet/security/thrdwave.asp.
The second section, “Security Standards You Should Know About,” discusses the important standards in use today. Sometimes, a developer will limit the sources of information used to make security decisions. Using standards-based security ensures you get the best security available that will work with solutions used by other companies. In addition, knowing the standards is like placing additional tools in your toolbox. You need this information to do a great job putting your next application together.
The HTTP Authentication Framework, Secure/Multipurpose Internet Mail Extensions (S/MIME), and Secure Socket Layer (SSL) sections explore ways to keep your SOAP application secure. Each of these technologies will add to your ability to keep both company and user data safe. You’ll need to use a security technology that matches your data transfer strategy. For example, S/MIME works best when you need to use SOAP attachments. (We discussed SOAP Attachments in the “Understanding SOAP Attachments” section of Chapter 2.)
The “User Identification Issues” section talks about methods for determining user identity. You don’t want just anyone to access your Web site. Unfortunately, given user privacy issues, you might have a hard time determining that a user is who he or she claims to be. Fortunately, there are some ways around this issue, especially if security is a big concern.
The “Where Do You Go from Here?” section provides some pointers on what you can do to improve the security of your SOAP application in general. Remember that security is an ongoing process and your application needs to change to keep up with security requirements.
The final section of the chapter is a security-related case study. We’ll look at the theory we’ve talked about in this chapter in action at a company just like yours. I’ll include some tips on how this company could have avoided some of the problems that it encountered.
This section also discusses what you can do when preventative care won’t do the job and you need to resort to monitoring to catch crackers as they enter your network.
Understanding SOAP Privacy and Security Issues
Anyone making even the smallest survey of trade press and magazine articles today will notice a definite increase in security and privacy articles. Corporations are tightening their security perimeters by purchasing an increasing number of security products. Security is a big money issue today; there isn’t any doubt about it. In addition, privacy issues are making their rounds through government agencies and it isn’t clear how the government will respond to growing pressure for individual privacy protection.
Of course, security vendor hype has something to do with increased panic in corporations. I used to get an occasional e-mail detailing some new virus. Today I get a new virus notice
almost every day. Eventually I may have to filter my messages to stem the tide of messages filled with woe and gloom. I take viruses seriously and use preventative software, but sometimes the hype factor verges on the ridiculous for viruses that haven’t even become a threat.
Front-page stories of identity theft and other privacy issues don’t help. People are concerned that someone will steal their identity with the very next purchase they make online. The government is weighing the right someone has to privacy against their desire to purchase items online. In the meantime, a few vendors are playing on people’s fear by selling them privacy protection policies. If someone steals their identity, the company promises to help them recover it.
Certainly, hype is a problem in the computer industry. If we weren’t suffering through security hype attacks, then vendors hawking other wares would hit us with some type of hype. Hype aside, security and privacy are serious issues. The following sections address important security and privacy issues in a way that helps you cut through the hype and concentrate on getting that next application completed.
What Are the Issues?
One of your first tasks in creating a distributed application is figuring out where hype ends and reality begins. On one hand, you can’t afford to ignore security and privacy issues because governments are creating laws that say you must care or spend time in jail. On the other hand, you don’t want to spend all your time in a dither, wringing your hands and wondering when the government will break down your door. Distributed computing opens a new world of security problems, but you don’t have to be overwhelmed by them. The following list includes reasons you need to care about security beyond desktop application requirements when building distributed applications.
■ Crackers are becoming more persistent. The wealth of virus kits just makes their job easier. Your data is definitely a prime target for corruption by cracker attack. For that matter, they might just decide to steal your data and sell it to the highest bidder.
■ Using a text-based data format makes your application work with other vendors’ systems better, even if those systems don’t rely on the same hardware or operating system. Of course, crackers can easily read and modify text data without leaving any fingerprints. Good security relies on placing roadblocks against intrusion and then detecting times when the roadblocks don’t work. Unprotected text-based protocols make detection very difficult and protection unlikely.
■ Governments all over the world are creating laws that address both corporate and personal privacy. You’re legally responsible for protecting user data from prying eyes, a difficult task even in the best situation.
■ Leaving ports open on your system allows partners, customers, and employees on the road access to your system. These open ports are also an invitation to cracker attack.
I could make this list much longer. The point is that distributed applications face many threats that desktop applications will never face. When you work with a desktop application, there’s a certain level of physical security involved that you don’t receive when working online. In addition, the threats you see today will become old and mundane while new threats
take their place. Security constantly changes when working with distributed applications because new vulnerabilities are exposed, and as you learn new techniques to provide security, crackers learn new ways to break through. You need a three-part plan to protect your data that includes security monitoring, improved security measures, and training.
To give you an idea of just how much the security scenery changes, consider the problems with the Berkeley Internet Name Domain (BIND) software. BIND is a widely used implementation of the Domain Name System (DNS), which lets servers translate human-readable Web site names to IP addresses. Who would have thought that crackers would use a hole in this software to bring Microsoft’s servers down? The crackers weren’t, in fact, aware of the problem until a Microsoft technician made a configuration mistake that took Microsoft’s Web site offline. Now that crackers know about the security vulnerability, Web site operators will need to upgrade to a clean version of BIND or suffer the consequences.
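The earlier point that crackers can modify text data without leaving fingerprints suggests one roadblock worth sketching: attaching a keyed digest to each message so the receiver can detect tampering. The example below uses HMAC with SHA-256 from Python’s standard library. This is my choice of technique for illustration, not something SOAP itself provides. The function names are hypothetical, and the sketch assumes both endpoints already share a secret key that they exchanged by some other secure means.

```python
import hashlib
import hmac


def sign_message(message: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_message(message: bytes, key: bytes, signature: str) -> bool:
    """Return True only if the message still matches the digest it was sent with."""
    expected = sign_message(message, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

A digest like this detects modification but does nothing to hide the message content, so it complements encryption rather than replacing it.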
User Privacy Issues
User privacy is going to become a major headache for most developers because there are so many government factions getting into the picture. The one thing that you should count on doing is making any user data exchanges airtight. You need to ensure databases are secure and build a buffer between them and the Internet, if possible. Data aging is another important issue. Any application you build should provide some sort of user-information aging so it eliminates users who don’t visit on a regular basis.
However, no matter how many precautions you take, some cracker is going to break in if he or she is determined to do so. I read articles in the trade press about application failures and information releases due to unknown security holes on a regular basis and I imagine that you do too. It’s unlikely that the people in the article could have foreseen the security breach or done anything more to protect against it. In fact, these articles seldom lambaste the people in question for the security problem or the failure of their software. What the article usually points out is that the company denied the problem and failed to notify users of the security breach.
Modern applications should include some form of security monitoring. I’m not talking about the big brother type of monitoring, but the type that alerts a network administrator to irregularities in data processing. Something as simple as reporting a user who repeatedly tries and fails to access the Web site can help locate crackers trying to break into your system. The application should maintain logs and provide some form of statistical analysis so that the network administrator can look for trends. Of course, you have to do all of this within the guidelines of the new laws that governments pass to protect the identity of users who access your site.
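The simple form of monitoring mentioned above, reporting a user who repeatedly fails to get in, might look something like the following sketch. The class name, the threshold of five failures, and the five-minute window are all assumptions chosen for illustration. A real application would also write each event to a log for the trend analysis described above.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginMonitor:
    """Flag users whose failed attempts pile up within a sliding time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0) -> None:
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def record_failure(self, user: str, timestamp: float) -> bool:
        """Record a failed attempt; return True when the user should be reported."""
        attempts = self._failures[user]
        attempts.append(timestamp)
        # Discard attempts that have aged out of the window.
        while attempts and timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) >= self.max_failures

    def record_success(self, user: str) -> None:
        """A successful login clears the user's failure history."""
        self._failures.pop(user, None)
```

Passing the timestamp in as a parameter, rather than reading the clock inside the class, keeps the sketch easy to test and lets you replay old log entries through it.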
At least two vendors are working on new standards that will help protect user privacy, as well as the content of a SOAP message. The first is the Security Services Markup Language (S2ML). You can find a good general page for this standard at http://xml.coverpages.org/s2ml.html. The second is AuthXML. You can find a good general page for this standard at http://xml.coverpages.org/authxml.html. In both cases, the standards allow a vendor to add attributes to an XML (including SOAP) message that describe access rights for each element. These standards also provide for
authentication and other important privacy issues. (Note that the standards groups for both of these standards appear in Table 1.1.)
User privacy often treads on the verge of the bizarre for the developer. One of the privacy issues that developers and the legal minds of government need to consider is how to treat user information when used in an official capacity. For example, if you’re an individual shopping at Amazon.com, then the rules for sharing your information are pretty clear: the vendor has to keep your information safe unless you grant permission to give it away. However, what happens when a corporate officer makes a business purchase over the Internet? In other words, it’s really a company making the purchase; the individual is simply the intermediary between the two companies. In addition, the individual supplies their company identity, not his or her personal one. Does the law protect a person’s corporate identity in the same way as his or her private identity? The answer is unclear right now, but it’s an important question to consider. As a developer, you may find yourself writing special code for those business-to-business transactions you coded using standard client/server techniques in the past.
Another privacy issue is the matter of opting into a mailing list versus opting out of it. Many vendors would like to use the opt-out scheme, which means they could assume everyone wants to be on their mailing list unless they specifically say otherwise. Vendors state that they’ll maintain larger lists using the opt-out option because fewer people are likely to ask to leave the list than take the time to opt into a mailing list. In other words, the vendors will be able to sell their list to other companies at a higher price because it will contain more names. Privacy groups, on the other hand, want to adopt the opt-in approach because it offers greater individual protection. Again, this is an important issue to consider because it affects how you write your application.
Data Security Issues
You’ve likely read countless tomes on maintaining the security of applications in general. For example, I probably don’t need to tell you that you should password protect your application or that you should only assign the level of security a user actually requires. Windows 2000 even adds the new role-based security model in which you can assign security to individual components and even methods within an application. All of these measures work well within an internal application because you don’t have to worry about outside influences nearly as much as you do about disgruntled employees.
Distributed applications require additional levels of security that internal applications don’t require. For example, anyone on the Internet can intercept the data flowing between your site and the site of your customer. You need to encrypt the data in some way to prevent prying eyes from looking at it. Unfortunately, the technologies required to encrypt the data also cause problems with firewalls. DCOM and CORBA both provide encrypting technology that’s hard to break because they encrypt the entire message (sometimes at several levels).
Unfortunately, neither of these protocols will do you much good because they don’t work very well on the Internet. We’ll see later in the chapter that some companies are working on ways to encrypt SOAP messages, but this technology is in its infancy as I write this.
Now, consider another problem. You encrypt the data using a well-known scheme. Unfortunately, the other party using your application doesn’t have the same platform. As a result, they don’t have access to your well-known encryption scheme and can’t decrypt your message. Some vendors are working on a common encryption scheme, but these common schemes usually end up working on a limited number of platforms. As a developer, you normally need to provide access to several different encryption schemes, so the endpoints of your application can decide on an encryption scheme that will work for everyone.
Disparate operating systems and hardware also present another problem: data translation. I’m unaware of any method for translating data from one format to another while in an encrypted state. Decrypting the data to translate it may present a problem if done incorrectly. You’ll want to ensure your application performs any required translations in a safe place, which means behind a firewall and on a secure server. Of course, the very act of translating the data opens the possibility of introducing data errors. SOAP partially mitigates this problem because it uses a text format that servers can easily parse.
A final problem to consider is one of technological differences between platforms. Consider the current problem with wireless technology. Vendors admit that the current technology leaves data unprotected during the transition from the world of wireless communication to the LAN. These vendors are working on a solution, but again, at the time of this writing none exists. The time the data remains unprotected is short, but crackers have taken advantage of less serious openings in the past.
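The idea that endpoints should decide on an encryption scheme that works for everyone can be sketched as a simple negotiation: each side lists what it supports, and the sender picks the first entry in its own preference order that the receiver also offers. The function name is hypothetical and the scheme names in the usage line are only labels. The sketch deliberately performs no encryption itself.

```python
from typing import Optional, Sequence


def negotiate_scheme(
    local_preferences: Sequence[str], remote_supported: Sequence[str]
) -> Optional[str]:
    """Return the most preferred local scheme the remote endpoint also supports."""
    remote = set(remote_supported)
    for scheme in local_preferences:  # ordered from most to least preferred
        if scheme in remote:
            return scheme
    return None  # no common scheme; the caller should refuse to send the data
```

For example, negotiate_scheme(["AES", "3DES", "RC4"], ["RC4", "3DES"]) selects "3DES". Returning None when nothing matches forces the caller to make an explicit decision instead of silently falling back to sending clear text.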
Security Standards You Should Know About
Security standards are an important part of your company’s safety net. Most administrators know there are holes in the security net for their company. These holes come from a variety of sources, including the operating system, ports open to the Internet, and off-the-shelf applications that they rely on to conduct business. All of those problems are real, but you don’t have to face them alone. Standards groups are working even as you read this to come up with methods for protecting data. All you need to do is learn the methods that these groups come up with for managing security on your network.
The advantages to using standards-based security are twofold. First, you won’t have to reinvent the wheel and create everything from scratch. Second, your security methods will mesh with those used by other sites, reducing the user learning curve and making it possible for you to use tools developed for other programmers. (There’s a rather dubious third advantage that you’ll have some idea of where security breaches might appear and can build components to monitor for them until the associated standards committee defines a fix for the problem.)
Let’s look at some of the existing and proposed security standards that affect the Internet today. Table 3.1 contains a list of what I consider essential security standards. Notice that
the Internet Engineering Task Force (IETF), http://www.ietf.org/, and World Wide Web Consortium (W3C), http://www.w3.org/, play a major role in devising and implementing Internet standards (with the help of member vendors, of course). You can find most of the IETF RFC documents at http://www.rfc-editor.org/. You can also find lists of the current IETF working groups at http://www.ietf.cnri.reston.va.us/html.charters/. These working groups help create the standards used on the Internet. The W3C usually has a list of their standards on their main Web page, so you can just click a single link and get to where you need to go.
Table 3.1 Internet Security Standards and Specifications
Standard
Description
AuthXML
This security technology is in the specification stage as I write this, and the vendor does intend to submit it to a standards committee (W3C most likely). AuthXML is an authentication and authorization specification that would replace localized forms of these same technologies. It’s based on XML and relies on digital signatures to perform its work. Since this isn’t an encryption specification, any data you transmit is still open for interception by third parties. You can find out more about AuthXML at http://www.authxml.org/.
Distributed Authentication Security Service (DASS) IETF RFC1507
DASS defines an experimental method for providing authentication services on the Internet. The goal of authentication, in this case, is to verify who sent a message or request. Current password schemes have a number of problems that DASS tries to solve. For example, there’s no way to verify that the sender of a password isn’t impersonating someone else. DASS provides authentication services in a distributed environment. Distributed environments present special challenges because users don’t log onto just one machine; they could conceivably log onto every machine on the network. You can find out more about this standard at http://www.wu-wien.ac.at:8082/rfc/rfc1507.hyx/$$root.
Digital Signatures Initiative (DSI)
This is a standard originated by W3C to overcome some limitations of channel-level security. For example, channel-level security can’t deal with documents and application semantics. A channel also doesn’t use the Internet’s bandwidth very efficiently because all the processing takes place on the Internet rather than at the client or server. DSI defines a mathematical method for transferring signatures, essentially a unique representation of a specific individual or company. Find out more at http://www.w3.org/DSig/Overview.html. It’s interesting to note that there’s an XML version of this technology underway. Read about this new technology at http://www.w3.org/Signature/.
eXtensible Rights Markup Language (XRML)
A new ContentGuard specification that defines how someone can use content provided by your company. It describes the rights, fees, and conditions of use for content. XRML also allows a vendor to define trusted systems that can use a product for testing and evaluation purposes. This technology relies on a trusted server to determine if someone can access content and what rights they have when they do. You can learn more about this specification at http://www.xrml.org/.
Generic Security Service Application Program Interface (GSS-API) IETF RFC1508
This specification defines methods for supporting security service calls in a generic manner. Using a generic interface allows greater source code portability on a wider range of platforms. IETF doesn’t see this specification as the end of the process, but rather the starting point for other, more specific, standards in the future. However, knowing that this standard exists can help you find the thread of commonality between various security implementation methods. You can find out more about this standard at http://www.wu-wien.ac.at:8082/rfc/rfc1508.hyx/$$root.
HTTP Authentication Framework RFC2617
One of the problems with HTTP 1.0 authentication is that endpoints send passwords in the clear, making it easy for others to grab the passwords and circumvent security. The HTTP Authentication Framework adds digest access authentication, which sends a cryptographic digest of the credentials instead of the password itself, improving overall security. You can find out more about this standard at http://www.faqs.org/rfcs/rfc2617.html.
Internet Protocol Security Protocol (IPSec)
This specification addresses issues of IP client security, such as the inability to encrypt data at the protocol level. It includes a wide range of specifications that will ultimately result in more secure IP transactions. You can find out more at http://www.ietf.cnri.reston.va.us/html.charters/ipsec-charter.html.
Private Communication Technology (PCT)
Like SSL, PCT provides a secure method of communication between a client and server at the low-protocol level. Microsoft originated the PCT specification. It can work with any high-level protocol such as HTTP, FTP, or TELNET. You can find out more at http://www.graphcomp.com/info/specs/ms/pct.htm.
Privacy Enhanced Mail Part I (PEM1) Message Encryption and Authentication Procedures IETF RFC1421
This specification outlines a procedure for encrypting mail in a way that protects the user’s mail while keeping the process of decrypting it invisible to the user. This includes the use of keys and other forms of certificate management. Some of the specification is based on the CCITT X.400 specification, especially in the areas of Message Handling System (MHS) and Message Transfer System (MTS). Related standards include: Privacy Enhanced Mail Part II (PEM2) Certificate-Based Key Management (IETF RFC1422), Privacy Enhanced Mail Part III (PEM3) Algorithms, Modes, and Identifiers (IETF RFC1423), and Privacy Enhanced Mail Part IV (PEM4) Key Certification and Related Services (IETF RFC1424). Find out more about this standard at http://www.cs.ucl.ac.uk/research/ice-tel/osisec/documentation/ or http://www.si.hhs.nl/~henks/comp/crypt.html (which includes the associated standards).
Secure Multipurpose Internet Mail Extensions (S/MIME)
This specification defines a method for different developers to create message transfer agents (MTAs) that use compatible encryption technology. Essentially, this means that if someone sends you a message using a Lotus product, you can read it with your Banyan product. S/MIME is based on the popular Internet MIME standard (RFC1521). You can find out about standard MIME at http://www.oac.uci.edu/indiv/ehood/MIME/. There’s a whole list of S/MIME specific resources at http://www.rsasecurity.com/standards/smime/resources.html. The S/MIME working group resides at http://www.ietf.org/html.charters/smime-charter.html.
Chapter 3
An Overview of Security Issues for SOAP
Table 3.1
Continued
Standard
Description
Secure/Wide Area Network (S/WAN)
The main goal of S/WAN is to allow companies to mix and match the best firewall and TCP/IP stack products to build Internet-based virtual private networks (VPNs). Current solutions usually lock the user into a single source for both products. S/WAN is no longer a specification or even a work in progress, but it has spawned other efforts along the same line. Find out more about S/WAN and some of the efforts associated with it at http://www.rsasecurity.com/rsalabs/faq/5-1-3.html.
Security Services Markup Language (S2ML)
This is a new data security standard under consideration by the Organization for the Advancement of Structured Information Standards (OASIS). It’s based on XML and designed to provide a common security language for all programming platforms. The main consideration for this standard is that it’s designed to move RPC packets such as SOAP messages across the Internet in a secure manner. You can find out more about this proposed security standard at http://www.s2ml.org/.
Secure Hypertext Transfer Protocol (SHTTP) RFC2660
This is the current encrypted data transfer technology used by Open Marketplace Server, which is similar in functionality to SSL. The big difference is that this method only works with HTTP. The IETF formed the Web Transaction Security (WTS) group to look at specifications like this one. You can find out more about this standard at http://www.ietf.org/html.charters/wts-charter.html.
Secure Sockets Layer (SSL)
Netscape developed this protocol for transferring encrypted information between the client and the server at the protocol layer. Sockets allow low-level encryption of transactions in higher-level protocols such as HTTP, NNTP, and FTP. The protocol also specifies methods for server and client authentication (although client-side authentication is optional). SSL is a popular specification that you can find in many places including http://webopedia.internet.com/TERM/S/SSL.html (check the links at the bottom of the page) and http://www.ncsa.uiuc.edu/InformationServers/WebSecurity/iw3_tut/NETSCAP1.HTM.
The Kerberos Network Authentication Service (V5) IETF RFC1510
The Kerberos model is based in part on Needham and Schroeder’s trusted third-party authentication protocol and on modifications suggested by Denning and Sacco. As with many Internet authentication protocols, Kerberos works as a trusted third-party authentication service. It uses conventional (shared secret key) cryptography. Kerberos emphasizes client authentication with optional server authentication. Find out more at http://info.internet.isi.edu/in-notes/rfc/files/rfc1510.txt.
Universal Resource Identifiers (URI) in WWW IETF RFC2396 (and others like RFC1630)
A URI provides a means of identifying Internet objects by name, by address, or both. A URL (uniform resource locator) is actually a form of URI containing an address that maps to a specific location on the Internet. A URI doesn’t have to map to a location that someone can visit. Technologies like SOAP use URIs as identifiers (for namespaces, for example) as well as addresses. Resource names and addresses still appear in clear text, so don’t rely on a URI by itself to keep Web site resources private. Learn more about URIs and how they compare to URLs at http://www.w3.org/Addressing/.
Security Standards You Should Know About
No, you can’t use all of these standards with SOAP, but you can often supplement your SOAP coverage by relying on these standards to protect other areas of your network. Two of the more interesting recent additions to my list of standards are Netegrity’s Security Services Markup Language (S2ML) and Securant’s AuthXML. Neither vendor (nor the associated standards groups) has released these standards, but they’ll appear on the market in the near future and may provide better protection for SOAP messages than existing technologies such as SSL. The important thing to remember is that Internet text-based remote procedure call (RPC) mechanisms provide nothing in the way of native security support, so you need to use some form of protection for your application. We’ll discuss three of the most important (and released) SOAP application security solutions later in the chapter.
If you want to find out the latest information on where the world is going with security standards (particularly on the Internet), check http://www.w3.org/Security/. This page of general information won’t provide everything you need, but it’ll give you places to look and links to other sites that do provide additional material. Developers will want to get the commercial view of security at http://www.rsasecurity.com/. The RSA site covers a broad range of topics. This site is the best place to start if you want to add new physical security technologies, like smart cards, to your applications. You can also find out the status of IETF efforts by viewing the document at ftp://ftp.isi.edu/internet-drafts/1id-abstracts.txt. Finally, you can find good general information about most XML technologies, including S2ML and AuthXML, at http://xml.coverpages.org/. It also pays to look at http://afs.wu-wien.ac.at/usr/edvz/gonter/rfc-list.html because it organizes standards into functional areas. Make sure you check out the Internet RFC/STD/FYI/BCP Archives at http://www.faqs.org/rfcs/ if you ever need to find a request for comment (RFC) document quickly.
HTTP Authentication Framework
Authentication, knowing whom you’re dealing with at the other end of the wire, is one of the main ways to ensure your communication is safe. Even in the early days of the Internet, a server could send a 401 (Unauthorized) response message, or a proxy server could send a 407 (Proxy Authentication Required) response message, to a client in order to start the authentication cycle. The only problem with these early efforts is that the authentication information appeared in clear text. Someone could intercept the password information, keep a copy for later use, and modify it so the authentication failed.
The HTTP Authentication Framework is associated with HTTP 1.1; together they provide better means for securing authentication information between a client and server. One of the major reasons to use the HTTP Authentication Framework is to ensure that passwords and other user authentication information never cross the wire in clear text.
So, how does the HTTP Authentication Framework help a SOAP developer? First, it tells you that you should use HTTP 1.1 whenever possible to ensure private communications. While HTTP doesn’t provide the best protection for your application, it does provide some level of protection. As a minimum, you can protect all important user authentication information. Of course, the HTTP Authentication Framework isn’t a data encryption standard. Your data is still open to prying eyes; only the authentication portion of the communication is secure.
The mechanism behind the HTTP Authentication Framework couldn’t be simpler. A client requests a resource such as a Web page from the server. The server sends a 401 or 407 response in place of the 200 (OK) response that it normally sends. The response also includes a challenge entry in the HTTP header that indicates the type of challenge the server poses and the type of credentials the server requires from the client. Here’s a simple example of the entries you’ll see in the response header.
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest realm="[email protected]",
    qop="auth,auth-int",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    opaque="5ccc069c403ebaf9f0171e9517f40e41"
Of course, there may be other entries in the header; these entries represent what you’ll see for the HTTP Authentication Framework. The response header begins with the HTTP version number and status, as mentioned earlier. The WWW-Authenticate entry indicates that the server wants the client to use the Digest authentication scheme. The realm entry names the protected area (realm) of the server that holds the resource in question. The quality of protection (QOP) entry comes next. It includes two tokens in most cases: auth indicates authentication alone, and auth-int indicates authentication plus an integrity check of the message body. The nonce entry is a unique string that the server creates for each challenge using a special algorithm. The client must fold this string into its reply, which lets the server ensure that the reply is unique and not one repeated by a cracker. Finally, the opaque entry is another string generated by the server. In this case, the client returns the string unchanged within its response to the server, which helps the server tie the reply to the challenge it issued.
The client displays a password dialog box for the user. After the user enters a username and password, the client sends this information in hashed format to the server. The client sends the information as token and value pairs. Note that the password itself never appears; the response entry holds a one-way hash (MD5 by default) computed from the password and other values. Here’s a simple example of what you might see.
Authorization: Digest username="Mueller",
    realm="[email protected]",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    uri="/mydir/default.htm",
    qop=auth,
    nc=00000001,
    cnonce="0a4f113b",
    response="6629fae49393a05397450978507c4ef1",
    opaque="5ccc069c403ebaf9f0171e9517f40e41"
The client request message contains many of the same elements as the server response. Notice that the realm, nonce, and opaque values are the same. The QOP has the same meaning, but the client sends only the one token it selected from those the server offered (auth in this example). The client must provide a username as part of the digest that uniquely identifies the user on the server (not the client machine). The uri entry contains the name and path of the resource the client wants to access. The reason for this entry is that proxies can act as an intermediary for the client and this entry reflects the client’s wishes, rather than those of the proxy. The nc entry contains a nonce count, which reflects the number of requests that the client has sent using the current nonce. The cnonce contains a unique string generated by the client. Because this string becomes part of the hash, it helps defeat chosen plaintext attacks by third parties. The client only provides a cnonce value if the server provides a QOP entry. The response value contains a hash computed from the username, realm, password, nonce, and request information. The server computes the same hash using the password in its security database and compares the two values. If they match, the server sends the requested resource and includes the 200-status value. (See Figure 2.1 and accompanying description for a detailed look at how HTTP encases a SOAP message.)
The HTTP Authentication Framework includes the concept of realms. A developer can partition a resource and identify each partition with a different realm. The user requesting the resource will gain access only to the realms that he or she is qualified to use.
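To make the hashing step concrete, here is a minimal Python sketch of how a client could compute the response entry when the QOP is auth. It follows the calculation defined in RFC 2617, but the function names are my own invention, and the sample username, password, and URI come from the example in RFC 2617 itself rather than from the headers shown in this chapter (which don't include a password).

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = md5_hex(f"{method}:{uri}")                 # identifies the request
    # The server nonce, nonce count, and client nonce make every reply unique.
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Sample values from the worked example in RFC 2617 (assumed here for testing).
response = digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)  # 6629fae49393a05397450978507c4ef1
```

The server repeats the same calculation with the password it has on file and compares the two strings. Because the password only ever appears inside a hash, someone watching the wire sees the response value but can't read the password from it.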
Secure/Multipurpose Internet Mail Extensions (S/MIME)
In the “Understanding SOAP Attachments” section of Chapter 2, we discussed how you could use MIME to create multipart SOAP messages that contain attachments. Using MIME allows you to send attachments using SOAP, even binary ones. However, it does nothing to secure the transmission. A cracker can still view the message, intercept it, copy it, and otherwise trash your communications. S/MIME is a solution to this problem.
S/MIME depends on the use of digital certificates and encryption to provide security. The digital certificate technology relies on the use of a private key for encryption and a public key for decryption. (Most references call this use of the key pair a digital signature.) The sender keeps the private key, but sends a copy of the public key to anyone who needs it to read messages. Because only the sender has the private key, only the sender can generate the encrypted message in question. The encryption process ensures that a client can instantly detect any tinkering by a third party because a third party won’t have the private key required to encrypt the message again. The public key makes it easy to decrypt the message and to identify the sender during the decryption process. Using S/MIME represents the easiest way to ensure that a server and client authenticate each other and that the data they exchange remains safe.
The public key is the weak point in the equation. A cracker could potentially gain access to the sender’s public key (unless you send the key by some secure method, such as on a floppy). This means the cracker could still read the sender’s messages and use the content, but still couldn’t generate messages on the sender’s behalf. (To keep the content private as well, S/MIME can encrypt a message with the recipient’s public key so that only the recipient’s private key can decrypt it.)
Some developers consider S/MIME less secure than alternatives such as Pretty Good Privacy (PGP) because of the way the protocol handles encryption and decryption. The fact that the keys remain static means that the potential for a cracker to break into your system is higher than if you used dynamically generated keys. No solution is perfect, but S/MIME represents a relatively safe method of data transfer.
You may wonder why companies don’t rely on secure data exchange technologies like PGP. The reason is simple—cost. Most browsers and e-mail programs understand S/MIME without any special additions, so it’s free. You have to pay for higher security alternatives like PGP and ensure your customers use this solution as well. Availability and cost are why S/MIME is so popular.
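The private key and public key relationship described in this section is easier to see with numbers. The following Python sketch uses a toy RSA key pair built from tiny textbook primes; the key values, function names, and sample message are all assumptions made for illustration, and nothing here resembles the key sizes or message formats that a real S/MIME product uses.

```python
import hashlib

# Toy RSA key pair built from tiny textbook primes (p=61, q=53).
# Real keys are thousands of bits long; these values are for illustration only.
N = 61 * 53        # modulus shared by both keys (3233)
PUBLIC_E = 17      # public exponent, handed to anyone who asks
PRIVATE_D = 2753   # private exponent, kept by the sender


def digest(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: transform the message digest with the private key."""
    return pow(digest(message), PRIVATE_D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: reverse the transform with the public key, then compare."""
    return pow(signature, PUBLIC_E, N) == digest(message)


message = b"Order 42: ship 10 units"
signature = sign(message)
print(verify(message, signature))            # True: signature matches
print(verify(message, (signature + 1) % N))  # False: signature was altered
```

Anyone holding the public key can run verify(), but only the holder of the private key can produce a signature that passes. A changed message fails the same check, although with a modulus this small a chance collision is possible, which is one reason real keys are so much larger.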
Secure Sockets Layer (SSL)
SSL is a protocol that’s easy to understand. As with S/MIME, SSL relies on the use of digital certificates exchanged between a client and server. A client and server obtain these digital certificates from a third-party vendor, such as VeriSign, which can vouch for the identity of both parties. However, unlike S/MIME, the certificate exchange procedure is a little more complicated than simply sending a digital certificate from one machine to another. Here is the six-step process for SSL authentication:
1. The client sends the server an unencrypted random message along with its VeriSign issued certificate (which contains the client’s public key). VeriSign signed the certificate using its private key. Because everyone has VeriSign’s public key, the server can verify the signature and check the certificate for accuracy. Also, because the certificate was signed using VeriSign’s private key, no one can forge a certificate of their own; they have to get it from VeriSign.
2. Once the server confirms it’s received a valid certificate and public key from the client, it tells the client to send an encrypted version of its original message.
3. The client computes a digest of its original random message and then encrypts it using its private key.
4. The server uses the client’s public key to decrypt the digest.
5. The server compares the decrypted digest to a digest it generates from the random message originally sent in unencrypted form by the client.
6. If the two digests match, the server authenticates the client.
The use of additional encryption and verification steps makes SSL slightly more secure than S/MIME. SSL doesn’t rely on the exchange of a certificate alone; the client must also prove that it holds the private key that matches the certificate. That’s the reason this protocol requires additional handshaking steps. It’s also the reason that crackers would have a harder time impersonating the client: a stolen certificate or public key isn’t enough.
SSL offers other advantages to developers. It’s transport protocol independent, which means you can use it in many different environments. The client and server can negotiate a cryptographic and encryption algorithm that both support, which means SSL is platform neutral for the most part. Even though client authentication occurs using asymmetric (public key) cryptography, data exchange occurs using symmetric (private key) cryptography with a key that changes on a per session basis. Finally, the protocol is reliable: data undergoes an integrity check before the protocol accepts it as valid.
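As a rough illustration of how an application asks for the certificate checks described in this section, the following sketch uses the ssl module from the Python standard library. Python isn't one of the toolkits this book covers, and its ssl module negotiates TLS, the successor to SSL, so treat this as an assumption-laden sketch of the idea rather than a recipe for any product named here.

```python
import ssl

# Client side: the default context verifies the server's certificate chain
# against the trusted authorities and checks the host name in the certificate.
client_ctx = ssl.create_default_context()
client_ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older versions

# Server side: client authentication is optional in the protocol, so a server
# that wants a client certificate has to ask for one explicitly.
server_ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
server_ctx.verify_mode = ssl.CERT_REQUIRED

# A real server would also call server_ctx.load_cert_chain() with its own
# certificate and private key files before accepting connections.
print(client_ctx.verify_mode == ssl.CERT_REQUIRED)  # client checks the server
print(client_ctx.check_hostname)                    # and the host name
print(server_ctx.verify_mode == ssl.CERT_REQUIRED)  # server demands a client cert
```

The handshake itself, including the negotiation of algorithms and the per-session symmetric key, happens inside the library once a socket is wrapped with one of these contexts.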
User Identification Issues
A user name and password is normally the only form of access identification that a user needs to gain access to your server. In most cases, that’s all you really need to ensure that the data on your system remains secure. However, even if you make the software that grants access to the system totally bulletproof (an ever more difficult task as computers gain processing power), there are still problems with this approach. Obviously, the most common problem is lost passwords. Asking the user to provide a password that’s difficult to break often results in a password that’s also difficult to remember. Consequently, the network administrator ends up spending more time fixing lost password problems.
Crackers and employees alike can also compromise passwords in a number of ways. The least subtle scenario is the user who writes their password down on a notepad, then places the note near the keyboard or monitor. Believe it or not, this happens on a regular basis. At one company, I found a single notepad that contained not only the administrative assistant’s password, but the manager’s password as well.
Even if everyone at a company does their best to protect security, and you have the best security available for your application, crackers often find ways around the security measures you have in place. Just the fact that your security relies on a password means that someone can guess the password given enough time and opportunity. In other words, using the name and password method of security is problematic at best. You can’t rely on a password to provide total security for some types of very confidential data.
Fortunately, you have some alternatives when it comes to the password dilemma. The following sections talk about the two most popular alternatives: smart cards and biometrics. Smart cards rely on something that looks like a credit card with internal memory. The network administrator encodes the user’s credentials on the smart card, which the user swipes in a special reader. Biometrics rely on unique body parts such as the iris or fingerprints for identification. Both methods incur extra expense, but represent the best available in security today.
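A little arithmetic shows what “given enough time” means for password guessing. The short Python sketch below counts the candidate passwords for a given alphabet and length and estimates how long an exhaustive search would take. The guessing rate is purely a hypothetical figure chosen for illustration, not a measurement of any real attack.

```python
def guesses_needed(alphabet_size: int, length: int) -> int:
    """Count every possible password of exactly this length."""
    return alphabet_size ** length


def worst_case_days(alphabet_size: int, length: int,
                    guesses_per_second: float) -> float:
    """Days needed to try every candidate at a fixed guessing rate."""
    seconds = guesses_needed(alphabet_size, length) / guesses_per_second
    return seconds / 86_400  # seconds in one day


RATE = 1_000_000  # hypothetical guesses per second, assumed for illustration

print(guesses_needed(26, 6))  # 308915776 (six lowercase letters)
print(guesses_needed(62, 8))  # 218340105584896 (eight letters and digits)
print(worst_case_days(26, 6, RATE))
print(worst_case_days(62, 8, RATE))
```

At the assumed rate, the six-letter password falls in a few minutes, while the eight-character mixed password holds out for years. The catch is the one noted above: the longer and more varied the password, the harder it is for the user to remember.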
Using Smart Cards
One of the more common security methods today is the use of smart cards. A smart card is a credit card-sized device with some processing power built in. The user swipes the card in a special reader to gain access to the network. What happens is that the card provides a digital certificate that identifies the user in lieu of a password. One of the nice things about using a smart card is that you don’t have to print any identification on the outside, like the numbers on a credit card.
One company that’s starting to make smart-card readers a standard option is Hewlett-Packard (HP). They have optimized their Vectra Desktop PCs and Kayak PC workstations to use either the Gemplus (a Veridicom product) or Schlumberger smart-card readers. In addition, the ProtectTools corporate security strategy offered by HP extends from the client to the firewall, which enables a company to give outside partners secure access to company data. The HP TopTools Management software allows managers to view clients that have smart-card readers. In essence, there’s already an infrastructure and products in place to allow a company to implement full security using smart cards.
A smart card does eliminate the problem of compromised passwords, at least without assistance. Without a password lying around, it would take a concerted effort by an inside party to break network security. Such an effort would be easier to trace than a break-in by an outside source. However, a user can still lose a smart card. Anyone who knows what the card is used for and who it belongs to can gain access to that person’s secure resources. Unfortunately, you still can’t stop the user from placing some form of identification on the card, even if the issuing company doesn’t. In addition, the issuing company has to bear the cost of issuing and maintaining the cards, which could be an expensive proposition.
Using Biometrics
There’s a relatively new alternative to both passwords and smart cards. Biometrics is a statistical method of scanning an individual’s unique characteristics to ensure that they are who they say they are. Some of the scanned elements include voiceprints, irises, fingerprints, hands, and facial features. The two most popular elements are irises and fingerprints because they’re the two with which most people are familiar. The advantages of using biometrics are obvious. Not only can’t the user lose their identifying information (at least not very easily), but also with proper scanning techniques the identifying information can’t be compromised either.
There are two main problems with using biometrics: quality and price. Some managers don’t want to use biometrics because of the time it takes to create quality scans that ensure absolute accuracy. The quality issue is being resolved as computer hardware gets faster. In addition, the price for a single scanner can range from $100.00 for something simple to thousands of dollars for complex solutions. Considering the number of access points a typical company has, the price of using biometrics for all but the most stringent security requirements could be prohibitive. Obviously, you have to weigh the cost of the data you’re trying to protect against the cost of protecting it.
Fortunately, the tide is turning for biometrics as more companies start to pursue this solution. Today you’ll find biometrics in use in a wide range of government and high-security institutions. Many industry experts predict that forward-thinking financial organizations will begin using biometrics identification by the end of 2002. By 2003, many corporations will view fingerprint identification as the access method of choice for remote communication. Iris identification may begin to overtake fingerprint identification as early as 2004 as the primary means of identification used by some corporations.
A few notable companies create biometrics solutions today. One of the most popular systems, the IriScan System 2100, can scan the iris of an individual using a camera instead of direct contact. Find out more about IriScan, Inc. at http://www.iriscan.com/. If your company prefers to use fingerprints instead of other biometrics techniques to enforce security, you might want to look at the fingerprint reader chip solutions provided by Veridicom, Inc. (http://www.veridicom.com/). They’re working with Intel to incorporate interfaces for biometrics into the Common Data Security Architecture (CDSA). CDSA is a comprehensive set of security services that will make secure transactions easier. It has a four-layer architecture: application, layered services and middleware, Common Security Services Manager (CSSM) infrastructure, and security service provider modules. Find out more about CDSA at http://www.opengroup.org/security/cdsa/.
IriScan, Inc. and Veridicom, Inc. aren’t the only companies getting into biometrics. Compaq Computer has introduced a sub-$100 fingerprint reader that it developed with Identicator Technology. You can find out about this and other Compaq biometric technologies at http://www.compaq.com/newsroom/pr/1998/pr070798b.html. A few other biometric product dealers include Saflink (http://www.saflink.com/), Keyware Technologies (http://www.keyware.com/), Visionics (http://www.faceit.com/), and Ethentica (http://www.ethentica.com/). All of these companies provide various biometric products that can make your network just a bit more secure from prying eyes.
Where Do You Go from Here?
Most of the security software that you can use today is vendor-specific and proprietary, which means that you’re stuck looking at security through one vendor’s eyes. The problem with this approach is that you may not find a single vendor who can address all of your security concerns. Sure, most server operating systems today include a wealth of security features, but today’s computing environment doesn’t limit itself to just one isolated company. In a worldwide Internet economy, you need the ability to secure a wide range of computer equipment from a large number of security threats. The ability to make security solutions from more than one vendor work together is no longer a luxury; it’s a necessity.
Security vendors see this problem as well and are working to make their software interoperable, using a number of initiatives and APIs. The latest initiative to hit the market is the Common Content Inspection (CCI) API, a project started by Aventail Corp., Finjan Software, Ltd., and Check Point Software Technologies. This group joins others like the Adaptive Network Security Alliance (ANSA) and the Open Platform for Secure Enterprise Connectivity (OPSEC) Alliance. The goals of these three groups vary slightly, but the main goal is to make it possible for more than one vendor’s products to work together. The largest of these groups is the OPSEC Alliance with 260 members (as of this writing).
The CCI API seeks to promote interoperability using common interfaces for inspecting content. In other words, each vendor’s software might provide completely different features and capabilities, but they would be able to work with the same data because the means for inspecting it are the same. The resulting specification is supposed to work with data at whatever level it appears, including firewalls, antivirus checks, and mobile code. You can find out more about CCI API at http://www.stardust.com/cciapi/ and http://www.finjan.com/cciapi.cfm.
The OPSEC Alliance is providing a similar set of APIs through Content Vectoring Protocol (CVP). The difference, in this case, is that they’re concentrating on a single firewall product, Check Point’s Firewall-1. In addition to content, OPSEC is working on APIs that allow various vendor products to work together at the application level. At the time of this writing, the OPSEC Alliance offering is in place and even includes a certification program. You can find out more about OPSEC at http://www.opsec.com/.
Like the OPSEC Alliance, ANSA caters to a specific vendor product. In this case, the target software is Internet Security Systems (ISS) Group’s intrusion detection software. ANSA currently has 40 companies participating in the standardization effort. Unlike the other two security interoperability efforts discussed in this section, ANSA offers a software development kit (SDK) that contains Adaptive Network Security (ANS) modules designed to make interoperability easier. The ANS modules will allow ISS Group intrusion detection software to issue alerts to firewalls, extranet management applications, and other software as soon as it detects an intrusion. These other applications would then deny system access to the intruder. You can find out more about ANSA at http://www.iss.net/.
Case Study
Sometimes security is more a matter of determined cantankerous behavior than spiffy new technologies. Consider the case of a company that we’ll call NoGo Corporation (to protect the guilty). The company did all of the right things. They added a firewall to their system. The network administrator studiously applied every operating system patch as soon as the vendor found problems and issued fixes. The company had a security policy in place and held regular training sessions for users. Finally, the network administrator consistently audited the system for unneeded user accounts and squelched them. It sounds like a formula for security success on any network.
It shouldn’t surprise you that crackers invaded the company not once, but several times in a two-month period. The crackers accessed the company’s database and made several thousand customer accounts accessible to the public. This included the customers’ credit card numbers and personal information such as addresses and telephone numbers. In short, this fiasco cost the company a lot, even though it did everything by the rulebook.
The problem is that crackers don’t use a rulebook. They use determination and time-tested procedures to break into computer systems. Crackers don’t care about your rulebook and don’t intend to follow it. The administrator for this company made the mistake of saying his system was secure during an interview by a magazine. The magazine made a mistake by publishing the network administrator’s comment verbatim. Once the word was out on the street, crackers felt challenged and wanted to show the world that they could break into any system. The truth is that computers are so complex today that crackers can break into any system. A smart developer doesn’t rely on technology alone to keep a system secure because technology is relatively easy to overcome.
In addition to saying too much, the administrator further complicated things by providing less than stellar network monitoring. A good network administrator assumes that crackers will get into his or her system and looks for signs of entry. A good developer provides the administrator with monitoring tools that provide a full view of the network without impinging on customer privacy. In addition, developers today need to use standardized protocols that an administrator can update outside the application. This enables the administrator to fix problems without getting the developer involved every time.
NoGo Corporation eventually overcame the embarrassment and loss of customers that the security breach caused. I assume the network administrator found work elsewhere. The company now incorporates monitoring as part of their security strategy. A few workers may feel big brother is watching them, but customers feel more secure and that’s the bottom line.
CHAPTER
4
Using SOAP to Create a Simple Application
In this chapter
Introduction
An Overview of Microsoft’s SOAP Toolkit
An Overview of the Application
Shortcuts for Creating SOAP Applications Quickly
Understanding Namespaces, the Short Version
Creating the Server Side Code
Creating the Client Code
Testing the SOAP Application
Handling SOAP Errors
Performance Concerns for all Applications
Project
Introduction
SOAP, as mentioned in previous chapters, is a specification for a wire protocol. It enables you to transfer data between machines in a distributed application. In addition, the designers of this specification intend to make SOAP easier to use than other wire protocols, while overcoming some of the difficulties in using these older protocols over the Internet. We’ve also looked at the format of a SOAP message and some of the special features you need to consider while designing your application. This chapter completes the theoretical part of the book and gets into some hands-on examples. We’ll look at the last piece of the puzzle, the application that uses SOAP to transfer data from one point to another. I’ll present a simple example using the Microsoft SOAP Toolkit, one of many available on the market right now.
This chapter concentrates on the Microsoft SOAP Toolkit because I feel it will be one of the more popular toolkits on the market. However, you should be aware that there are many SOAP toolkits available and we’ll look at some other selections as the book progresses. You should select the SOAP toolkit that best meets your company’s needs. In some cases, the Microsoft SOAP Toolkit is the worst possible choice (at least as of this writing). For example, you can’t use it for PDA development. In short, there isn’t a “one size fits all” choice in the world of SOAP today, so it pays to know which toolkits offer specific features. I also designed the example in this chapter to fill in the gaps in your theoretical knowledge of SOAP. Most SOAP applications are more complicated than the one here.
The first section of this chapter, “An Overview of Microsoft’s SOAP Toolkit,” acquaints you with the Microsoft SOAP Toolkit. We’ll discuss toolkit features and some of the problems that you’ll experience when using them. This toolkit is central to many of the examples in the book, so you need to know as much as possible about it at the outset.

The second (“An Overview of the Application”), fifth (“Creating the Server Side Code”), and seventh (“Testing the SOAP Application”) sections of the chapter contain the example application. We’ll talk about how to put a basic SOAP application together, how the server side of the application works, how a client requests services, and finally, how to begin testing most applications.

The third section of the chapter, “Shortcuts for Creating SOAP Applications Quickly,” provides tips you can use to get your application out the door faster. Each tip helps you get a little more out of your development environment and produce applications with fewer bugs. In addition, we’ll talk about ways that you can improve overall efficiency of a team development effort.

SOAP applications rely on namespaces to find resources and external code. The fourth section of the chapter, “Understanding Namespaces, the Short Version,” tells you how namespaces work within a SOAP application. This section isn’t a complete tutorial; it provides background information we’ll build upon later. You’ll see namespaces used throughout the rest of the book in various situations and will gain a better appreciation for them as you work with the examples.
No matter how well you design an application, it’s going to have bugs. Testing complexity is a problem when working in a distributed application environment. The “Handling SOAP Errors” section shows you how to trap errors and handle them. More importantly, it tells you how to diagnose and squash bugs that error-handling routines report as part of their normal operation.

Writing an application is one thing; making it run fast is another. The “Performance Concerns for all Applications” section of the chapter provides tips for making SOAP applications run faster. This section contains general tips that you can use for any application.

Finally, the last section of the chapter will take you through the steps for a simple project of your own. It’s important to start simply with SOAP. You’ll find that SOAP is an easy protocol to understand, but you need to learn tricks of the trade as part of the learning process. This project is your first step in that learning process. It builds upon all of the other information presented in this chapter.
An Overview of Microsoft’s SOAP Toolkit

The Microsoft SOAP Toolkit is one of many offerings on the Internet right now. My feeling is that many developers will choose either the Microsoft SOAP Toolkit or the Apache Toolkit (http://xml.apache.org/soap/) as their first development option because both of these companies have a large following and provide a reasonable level of support. Of course, many developers might have more than one toolkit in their arsenal and for good reason: I don’t believe any one toolkit will do everything you need for some time to come. I plan to cover other toolkits as the book progresses, but let’s begin with the Microsoft SOAP Toolkit for now.

This section of the chapter discusses several important topics. We’ll begin by looking at the features of the Microsoft SOAP Toolkit. It’s important to know what this toolkit can do for you. A second section will tell you which types of applications you can develop using this toolkit. The next section will look at the other side of the coin: where does the Microsoft SOAP Toolkit fall down on the job? Finally, we’ll discuss how to install the Microsoft SOAP Toolkit and how to use it to create an application. We won’t discuss usage in detail; detailed information appears along with the example we create in this chapter.
This chapter relies on the Microsoft SOAP Toolkit to create all of the example code we’ll visit. If you want to participate in the examples, you’ll need a copy of the Microsoft SOAP Toolkit from http://msdn.microsoft.com/soap/. In some cases, you may need more than just the Microsoft SOAP Toolkit to get the job done. You can find a general list of Web Services resources at http://msdn.microsoft.com/webservices/. You might want to spend some time reading about the Web Service Description Language (WSDL) because the Microsoft SOAP Toolkit relies on WSDL at both the client and server end of the application. The WSDL file documents the functionality provided by a component, such as method calls. You can find out more about WSDL at http://msdn.microsoft.com/xml/general/wsdl.asp.
Toolkit Features

This section acquaints you with the Microsoft SOAP Toolkit features. Currently, the Microsoft SOAP Toolkit doesn’t have much to offer in the way of support. You’ll find that it includes a help file, some Visual Basic samples, WSDL generation tools, the Microsoft SOAP Messaging Object Generator Visual Basic add-in, and the DLLs you’ll need for client and server support. It also includes some example ASP files you’ll need to provide a “listener” on the server. (A listener is an application that runs continuously and looks for client requests.)

Future versions of the Microsoft SOAP Toolkit are supposed to contain full support for Visual C++, including header files and example programs. Hopefully, the next edition will also include a more complete help file; the current offering is sparse to say the least. The “Problems You’ll Experience” section that follows discusses these issues in more detail.
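To make the idea of a listener more concrete, here is a minimal sketch of the request-handling core of one, written in Python rather than ASP. It doesn’t describe how the toolkit’s ASP listener works internally; it only illustrates the cycle of reading a SOAP request, doing the work, and building a SOAP response. The method name Add, the parameter names a and b, the Result element, and the namespace urn:example-addit are all assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example-addit"  # hypothetical service namespace


def handle_request(request_xml: str) -> str:
    """Read a hypothetical Add request and return the SOAP response as text."""
    envelope = ET.fromstring(request_xml)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body.find(f"{{{SERVICE_NS}}}Add")
    # The parameters are assumed to be unqualified child elements named a and b.
    total = int(call.findtext("a")) + int(call.findtext("b"))

    # Build the response envelope around the result.
    response = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    response_body = ET.SubElement(response, f"{{{SOAP_ENV}}}Body")
    result = ET.SubElement(response_body, f"{{{SERVICE_NS}}}AddResponse")
    ET.SubElement(result, "Result").text = str(total)
    return ET.tostring(response, encoding="unicode")


# A hand-written request used only to exercise the sketch.
REQUEST = (
    f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}">'
    "<SOAP-ENV:Body>"
    f'<m:Add xmlns:m="{SERVICE_NS}"><a>2</a><b>3</b></m:Add>'
    "</SOAP-ENV:Body>"
    "</SOAP-ENV:Envelope>"
)

if __name__ == "__main__":
    print(handle_request(REQUEST))
```

A real listener would also sit behind a Web server, validate its input, and return a SOAP fault when something goes wrong; the sketch leaves all of that out.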
Some of you might still have the Microsoft SOAP Toolkit version 1.0. The list of features for version 2.0 differs significantly from Microsoft SOAP Toolkit 1.0, which relied on ROAP.DLL and Service Description Language (SDL) instead of the current setup, which relies on WSDL or various types of low-level access. Because of the massive changes from version 1.0 to 2.0, the examples in this chapter won’t work with the older version of the Microsoft SOAP Toolkit.
Let’s talk about the important pieces of the Microsoft SOAP Toolkit from an application perspective. Table 4.1 provides a list of the DLLs provided in the Microsoft SOAP Toolkit. This table tells how to use these DLLs within an application.
Table 4.1 DLLs Provided with the Microsoft SOAP Toolkit
DLL Name
Description
HLSC10.DLL
Contains the HTTP Library Connector. It allows you to verify information in the HTTP, XML, and SOAP portions of a message. For example, you can read or write SOAP messages, or check for SOAP errors. You won’t access this DLL directly often because most applications you create will use the high-level application programming interface (API) access method. The only time you need this DLL is if you want to perform a low-level SOAP message receive and send in situations where you require absolute message transmission control. While the low-level API access method does provide superior flexibility, it also incurs far greater development time and more room for error in formatting the SOAP message. This is a high-performance connector library for use with Windows NT and Windows 2000 only.
MSSMO.DLL
The Microsoft SOAP Messaging Object (SMO) Framework allows you to work with SOAP messages as you would any other object within Visual Basic. This DLL contains all of the classes required to work with SOAP message parts. Using this feature helps you to create SOAP messages without spending a lot of time learning XML. In short, the SMO Framework represents the least time-intensive method for creating applications.
MSSMOGen.DLL
You’ll normally use this DLL with the MSSMO.DLL. It contains the wizard used to generate SMOs. We’ll talk more about the Microsoft SOAP Messaging Object Generator add-in later in the chapter.
MSSOAP1.DLL
This is the first SOAP DLL that you’ll use on a high-level API access client or server. The Microsoft SOAP Library contains a number of classes, but you’ll always begin by creating either a SoapClient or SoapServer object. These objects support a single method call, Init(), that allows you to tell SOAP which WSDL file to use. Other classes allow you to perform tasks such as serializing data for output and reading input. You’ll see better how this works later in the chapter.
WISC10.DLL
This DLL contains the Windows Internet Connector Library. It provides essentially the same services as the HTTP Library Connector described earlier. You’ll also use this DLL for low-level SOAP message receive and send. This lower-performance library works with Windows 98, Windows Me, Windows NT, and Windows 2000.
WSDLGen1.DLL
The WSDL generation tools use this DLL to create a WSDL based on a server-side component you create. Normally you won’t need to access this DLL if you use the wizard to create your WSDL file. However, it’s nice to know that you can also create WSDL files on-the-fly if necessary by using this DLL. It contains classes that enumerate the interfaces within a component and write the associated information to a file. You can also examine interface information as part of using this DLL.
XHSC10.DLL
This DLL contains the XML HTTP Connector Library. It provides essentially the same services as the HTTP Library Connector described earlier. You’ll also use this DLL for low-level SOAP message receive and send.
You’ll also find an MSSOAPR.DLL file in the resources folder for the Microsoft SOAP Toolkit. This file contains resources for SOAP use in general and normally you won’t need to reference it directly.
One of the first places you should look when experiencing client-side errors in your SOAP application is the Microsoft XML Library. My workstation has three versions of this library installed. Although the library versions are clearly marked, you can select the wrong one by mistake. The current version of the Microsoft SOAP Toolkit relies on version 3 of the Microsoft XML Library found in MSXML3.DLL. Microsoft is already preparing version 4 of the same library that will appear as MSXML4.DLL.
As you can see from Table 4.1, the library that you’re going to use most often is the Microsoft SOAP Library. It contains everything you need to begin a conversation between client and server using the high-level API. What the table doesn’t show is that you’ll also need the latest version of the Microsoft XML Library. This library doesn’t appear in the same directory as the rest of the SOAP files. The Microsoft SOAP Toolkit installation program automatically adds the latest version of the Microsoft XML Library to your System or System32 directory.

On those few occasions when you do need low-level API access, make sure you include one of the three low-level API access libraries in Table 4.1. All the sample programs provided with the Microsoft SOAP Toolkit rely on the Windows Internet Connector Library. This doesn’t make the other selections any better or worse; it simply means you’ll spend a little additional time figuring out how to use them. It helps that all three libraries work about the same way, contain about the same classes with the same methods, and that you could theoretically use any of the three with the same boilerplate code.
Three Types of Microsoft SOAP Toolkit Application

SOAP isn’t a single application protocol. You can use it in a variety of ways and the Microsoft SOAP Toolkit doesn’t support all of them. The toolkit does provide support for the following three application types. (I’ve placed them in order of complexity from least to most complex.)

■ SOAP Message Object (SMO)
■ Remote Procedure Call (RPC) (also known as High-Level API)
■ Low-Level API
All of these techniques require that you create a listener. The complexity of the listener reflects the method you want to use for access and the complexity of the application itself. You’ll see how listener complexity varies as the book progresses. Two of the methods, SMO and high-level API, require the creation of WSDL files. Only the low-level API method avoids this requirement. Fortunately, you can generate the WSDL files with relative ease using one of two WSDL generation utilities (see the “WSDL Generator” section for details).

The SMO method is the easiest and least flexible method of creating a SOAP application. The client side of the picture is almost too easy. All you need to do is create an ActiveX DLL project and use the Microsoft SOAP Message Object Generator add-in to create the required component entries. Include this component within your client-side application code to make accessing the remote object easy. In fact, with careful planning, neither the server-side nor the client-side code will require much change.

The high-level API technique is just a little harder than the SMO method. The Microsoft SOAP Toolkit actually supports two forms of this project. You can use simple parameter calls or rely on an XML document to define the interface for your application. In both cases, the server-side component is unchanged from anything you created in the past. (You’ll have to create a server to handle the XML document details, should you decide to use an XML document in place of simple parameter passing.) The listener works with a WSDL and WSML file to create the required interface and perform the required data translations. The client-side application does require modification to use this technique; you must make it SOAP aware.

The low-level API technique is the most difficult to implement, yet provides the cleanest SOAP implementation. This method also provides the greatest flexibility because you have full control over every aspect of the communication. Both the client and server must provide SOAP instructions. The application modules not only initiate the SOAP environment, but create the individual lines of code passed on the wire using a serializer as well. You’ll have to write this kind of application from scratch and you won’t be able to rely on existing code for very much. In addition, modifications can become quite difficult because you need to recompile the application after every change.
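To give a feel for what “creating the individual lines passed on the wire” involves, the following is a rough sketch of a client that serializes an RPC-style call into a SOAP 1.1 envelope by hand. It uses Python’s standard XML library rather than the toolkit’s serializer classes, so treat it as an illustration of the idea and not of the Microsoft API. The method name, the service namespace, and the SOAPAction value are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"  # SOAP 1.1 encoding namespace


def build_call(method: str, namespace: str, params: dict) -> bytes:
    """Serialize one RPC-style call into a SOAP 1.1 envelope."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    # Each parameter becomes an unqualified child element of the call.
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def http_headers(soap_action: str, payload: bytes) -> dict:
    """Headers that a SOAP 1.1 request sent over HTTP POST carries."""
    return {
        "Content-Type": 'text/xml; charset="utf-8"',
        "Content-Length": str(len(payload)),
        "SOAPAction": f'"{soap_action}"',
    }


if __name__ == "__main__":
    # The method name, namespace, and SOAPAction value are placeholders.
    message = build_call("Add", "urn:example-addit", {"a": 2, "b": 3})
    print(message.decode("utf-8"))
    print(http_headers("urn:example-addit#Add", message))
```

Even in this small form, every element name and header is the application’s responsibility, which is where the extra development time and the room for formatting errors come from.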
Problems You’ll Experience

The Microsoft SOAP Toolkit is a work in progress. For that matter, all of the toolkits that we’ll work with in this book are works in progress, so you need to take proper precautions when working with them. The most important issue, of course, is to ensure you keep vendor-specified limitations in mind as you work with the product. Some people are already using these products to create production applications. Obviously, you’ll need to weigh the benefits of using SOAP against the problems of using a less-than-stable development environment and should prepare to make changes to your code later.

Eventually, if everything works right, the output of any toolkit that you choose will be compatible with any other toolkit. Most of these toolkits will also provide a wealth of application development features; many are feature rich already. You may even find that you can eventually choose a single toolkit to meet all of your needs. However, there aren’t any perfect toolkits on the market at the moment and most developers will find they need more than one to get the job done.

One of the more interesting problems with the current implementation of the Microsoft SOAP Toolkit is that it doesn’t work particularly well with Visual C++. This is an interesting problem because Microsoft normally emphasizes Visual C++ in its toolkits. Version 2 of the toolkit offers little in the way of Visual C++ example code, libraries, or header files. Creating a Visual C++ application using the Microsoft SOAP Toolkit is substantially harder than other options you may have. You may want to spend some time looking at the toolkit documentation and working with the example code before committing to using this toolkit in your next Visual C++ development project. In fact, we’ll use the 4S4C Toolkit in Appendix C to create a Visual C++ SOAP application.

Another problem that Microsoft plans to fix in the next version is the lack of support for complex data types and structures. For example, you can’t pass a user-defined type (UDT) using the current version of the toolkit. (I show how to get around this problem in the “Using Complex Data Types” section of Chapter 8.) This means that you will find yourself looking at other toolkits when Microsoft doesn’t provide enough complex data type support for your needs.
You’ll want to work with Windows 2000 with Service Pack 1 installed for your early development work. The packaging says that the Microsoft SOAP Toolkit works fine with Windows 98, Windows Me, and Windows NT with Service Pack 6 installed, but many people experience problems when working with these other platforms.

None of the example programs included with the Microsoft SOAP Toolkit work on a two-machine setup. Microsoft assumes that every developer using this toolkit works on only one machine and independently as well. Unfortunately, this is an extremely poor development environment. While you can learn SOAP on a single machine, make sure you develop using two or more machines to avoid many “single machine thinking” errors. I eventually modified the sample code in the Microsoft SOAP Toolkit to run on a multiple machine setup in order to see how it would run in the real world.

The Microsoft SOAP Toolkit currently offers nothing in the way of cross-platform support and I wouldn’t expect this to change. We have already discussed the issue of compatibility problems with other toolkits in the “SOAP and the Web Server” section of Chapter 1, but Microsoft intends to address this issue. You should be able to team the Microsoft SOAP Toolkit with another toolkit such as the Apache SOAP Toolkit to provide some level of cross-platform support. The problem, of course, is that now you’re working with two different toolkits and will likely find implementation differences. These differences can only serve to slow your development efforts.
Interoperability is a major concern for all SOAP toolkit vendors. However, the reality of interoperability may take a long time to realize. You can track current interoperability problems on the Yahoo SOAP list server at http://groups.yahoo.com/group/ soapbuilders. While this group may not solve all your interoperability problems, they’re a step in the right direction.
PDAs are becoming more important as remote users begin using them for their daily needs. Currently, the Microsoft SOAP Toolkit offers nothing in the way of support for PDAs. Microsoft has committed to offering some level of support for PDAs in the future, but not with the current toolkit release. Expect to see some form of PDA support in Microsoft SOAP Toolkit Version 3. In the meantime, you’ll want to use another product such as pocketSOAP or IdooXoap for your PDA needs. You’ll read about some of these toolkit alternatives in Chapter 10.

Earlier, I mentioned the requirement to use WSDL with the Microsoft SOAP Toolkit version 2.0. It turns out that the toolkit doesn’t provide full WSDL support yet, but it does provide enough for most uses. For example, you’ll find that you can’t use the WSDL tag yet. A future version of the Microsoft SOAP Toolkit is expected to provide full WSDL support, but you can’t use it fully now.
You can get around some of the file support problems in the Microsoft SOAP Toolkit by using third party products such as XML Spy. You’ll find a complete discussion of XML Spy in Appendix C, “Third-Party Tool Reference.” You still need to generate the XDR files for an SMO application by hand or use a product such as BizTalk Server (see Appendix B) for Version 2. Hopefully, Microsoft will add an XDR generation tool for Version 3 of the Microsoft SOAP Toolkit.
The SOAP specification is still fluid: developers continue to suggest changes that affect how SOAP works. Expect to see new problems pop up in Microsoft’s toolkit as the specification changes. In addition, Microsoft is almost certain to add features to their SOAP toolkit that will extend its functionality. In short, you’re going to run into additional compatibility problems. How extensive the problems become is up to you. If you choose to use only the features contained in the SOAP specification, your applications should maintain a high level of compatibility with applications created by other developers.

A potentially crippling problem with the Microsoft SOAP Toolkit is that it doesn’t send type information with the variables that it outputs. The reason that I say this problem is only potentially crippling is that the SOAP specification doesn’t require vendors to provide type information. We discussed one result of the lack of type information in an earlier chapter: Microsoft’s solution doesn’t work well with Apache servers. However, more devastating is the lack of support for variants. Without type information, the Microsoft SOAP Toolkit can’t support variant types of any kind.
Many distributed applications cross human language boundaries today, which means that you need to consider language compatibility along with other issues when developing an application. Consider, for example, a developer who created a SOAP application that works fine in English only or Dutch only. However, when the English version tries to contact the Dutch version (or vice versa), the application generates an error. The English version transfers Boolean values as “True” and “False” while the Dutch version uses “Waar” and “Onwaar.” The SOAP toolkit treats the two sets of values differently, making them different values as far as the SOAP application is concerned. Another common language-related problem is the representation of numbers. Some languages use the comma for the decimal portion of a number, while others use the period. SOAP toolkits often experience problems translating between the two numeric representations. A value of 99,9 in German may become 999 in English (instead of 99.9). In many cases, SOAP won’t generate an error when making an incorrect numeric translation, but the output of the application is incorrect (which means you have to check all cross language calculations carefully). SOAP toolkit vendors will obviously work on these problems as they become apparent, but you need to be aware of language issues as you develop applications.
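One defensive habit that follows from these language problems is to put values on the wire in a fixed, language-neutral text form rather than in whatever form the local machine uses for display. The sketch below uses the XML Schema lexical forms (true and false for Booleans, a period as the decimal separator) and refuses input that doesn’t match, so a localized value fails loudly instead of turning into the wrong number. The function names are hypothetical, and the sketch doesn’t describe how any particular SOAP toolkit behaves.

```python
from decimal import Decimal


def to_wire(value) -> str:
    """Serialize a value using XML Schema lexical forms, not display text."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, Decimal):
        return format(value, "f")  # fixed-point text with a period, no exponent
    if isinstance(value, float):
        return repr(value)  # period as separator, may use an exponent
    raise TypeError(f"unsupported type: {type(value).__name__}")


def bool_from_wire(text: str) -> bool:
    """Accept only the four xsd:boolean literals."""
    literals = {"true": True, "1": True, "false": False, "0": False}
    try:
        return literals[text.strip()]
    except KeyError:
        raise ValueError(f"not an xsd:boolean literal: {text!r}") from None


def decimal_from_wire(text: str) -> Decimal:
    """Parse a decimal and reject a comma instead of silently dropping it."""
    cleaned = text.strip()
    if "," in cleaned:
        raise ValueError(f"comma is not valid in a wire-format decimal: {text!r}")
    return Decimal(cleaned)
```

With this approach, a value such as “Waar” or 99,9 raises an error at the point where it enters the application, which is much easier to diagnose than a calculation that quietly uses 999.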
The point of this section is that Microsoft’s SOAP Toolkit has problems. Many of these problems will disappear with time and others will take their place. SOAP does offer the ideal of making every application compatible, something that you really need when developing distributed applications. The reality of the SOAP world is that you’ll find a higher degree of compatibility than ever before, but making your application work with someone else’s application will still require cooperation from both parties.
Using the Microsoft SOAP Toolkit

The Microsoft SOAP Toolkit contains two important tools that you need to know about. The first is the WSDL Generator that you’ll use with every application. This tool creates an XML description of your component that clients will use to access it properly. The second tool is the Microsoft SOAP Messaging Object Generator Visual Basic add-in. You’ll use this tool within the IDE to create client-side components that can interact with a server-side component by generating SOAP messages. The advantage to using the add-in is that you don’t need to know how to write XML.

The examples in this section of the chapter rely on a simple component that I created specifically for this task. All this component does is add two numbers together and return a result to the client. We’ll also use this component in the simple SOAP application example later in the chapter. I won’t present the source code for the component in the chapter to save space, but the source code does appear on the Que Web site for this book. You can find it at www.quepublishing.com. You’ll find both the source code and the compiled component in the Chapter 04/Sample Component/ directory. This directory also contains all of the ancillary files required for the application, such as the WSDL, WSML, and ASP files.

Make sure you copy the compiled version of the component to your server. Once you have it placed in a directory that you want to use for testing, make sure you set security to provide access to the component. We’ll look at what you’ll need to do to make the component accessible to Internet Information Server (IIS) later in the chapter. For now, all you need to do is register the component on both your local workstation and the server so that you can access it. Type RegSvr32 AddIt.DLL at the command prompt and press Enter. If the component registration works, you’ll see a success message like the one shown in Figure 4.1.
Figure 4.1 RegSvr32 displays a success message after you register a component.
You can unregister the AddIt component by typing RegSvr32 -U AddIt.DLL at the command prompt. Make sure you unregister the component before you remove it from the server. Otherwise, the registry will continue to point to a nonexistent location and you could receive unexpected error messages.
Now that you have some idea of what’s happening, let’s look at the two tools in the Microsoft SOAP Toolkit. The following sections look at the two tools in detail. We’ll talk about fixing the Microsoft SOAP Toolkit examples first, however, so that you can try them out on two machines.
Using the Microsoft SOAP Toolkit Examples on Two Machines

Before we go too far, let’s talk about one important issue that you’ll want to consider when using the Microsoft SOAP Toolkit. Most developers will want to try the Microsoft SOAP examples on two machines before they create their own projects. As stated earlier, a deficiency of the current toolkit is that it assumes you want to use a single machine for development purposes.

The best way to set the Microsoft SOAP Toolkit up for two-machine use is to install it onto a drive on the Web server from the local workstation. That way, the installation program will add all the required SOAP toolkit entries to the local workstation, but the files will reside on the server. Make sure you actually install all of the SOAP examples; this is an optional part of the installation process (as normal for all Microsoft development tools).

Once you’ve installed the toolkit, create a virtual directory to the \Program Files\MSSoapSDK\Web directory as specified in the Microsoft documentation for the samples. You’ll also need to register all of the DLLs in the \Program Files\MSSoapSDK\Binaries directory manually using RegSvr32 on the server.

Remember that there is one special upgrade that the Microsoft SOAP Toolkit installer performs that you wouldn’t spot right away. The inclusion of the Microsoft XML Parser version 3.0 is important because your SOAP applications won’t run without it. (You’ll get strange error messages; some of which will state the application couldn’t create an object, but the error message will never state which object.) Of course, your workstation has the XML Parser upgrade because you installed the Microsoft SOAP Toolkit on it, but the server may not have this feature. Make sure you copy the three XML Parser 3.0 files (MSXML3.DLL, MSXML3A.DLL, and MSXML3R.DLL) to the System32 directory on your server. You only need to register the MSXML3.DLL file using RegSvr32; the other two DLLs provide support functionality.
At this point, your server is set up, but the Microsoft examples still won’t work. The sample files assume that you’re using LocalHost, not a remote Web server. The fastest way to update the code is to open each source file, look for instances of LocalHost, and replace them with the name of your Web server. In addition, you’ll need to create new WSDL files (found in the \Web directory) for each of the examples. The “WSDL Generator” section that follows shows how to do this the easy way.

You can check the results of your efforts by opening one of the ASP files found in the \Web directory. Figure 4.2 shows the output of the CalcVB.ASP file. Notice that we’re still receiving an error message. However, it’s also important to note what type of error message we’re receiving. In this case, the message indicates that the error occurred internally because there’s a detail section. In addition, the error says that it couldn’t process your request. Since we didn’t provide any input, there was nothing to process, so the error message makes sense.

This quick test shows that the server is ready to process messages. Of course, you could still run into any number of application errors. For example, the WSDL files can contain incorrect namespace information or you can run into other problems in locating and using the component. All this test does is check the server connection.
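If you’d rather check a response like this from a script than by eye, the following sketch shows one way to pull the interesting parts out of a SOAP 1.1 fault. The sample fault text is invented for the illustration and isn’t a copy of what the toolkit returns; the element names faultcode, faultstring, and detail come from the SOAP 1.1 specification, and the function name read_fault is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical fault, not a copy of the toolkit's real output.
SAMPLE = f"""<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>Could not process the request.</faultstring>
      <detail><reason>No input supplied</reason></detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def read_fault(response_xml: str):
    """Return a dict describing the fault, or None if the body has no fault."""
    envelope = ET.fromstring(response_xml)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    fault = body.find(f"{{{SOAP_ENV}}}Fault") if body is not None else None
    if fault is None:
        return None
    return {
        "code": fault.findtext("faultcode"),
        "message": fault.findtext("faultstring"),
        # SOAP 1.1 uses a detail element for errors tied to processing the Body.
        "has_detail": fault.find("detail") is not None,
    }


if __name__ == "__main__":
    print(read_fault(SAMPLE))
```

A fault code in the Server class together with a detail section is the pattern described above: the request reached the server, and the failure happened while the server tried to process it.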
Figure 4.2 Check your server setup by viewing the results of an ASP file request.
WSDL Generator

The Microsoft SOAP Toolkit provides WSDL support in the form of WSDL Generator. It provides a graphical interface like the one shown in Figure 4.3. The WSDL Generator will work with a COM component that includes a type library. The type library is essential since the WSDL generator uses it to provide a description of the component interfaces and methods.

Figure 4.3 The WSDL Generator utility allows you to create WSDL files.
As you can see from Figure 4.3, the WSDL Generator begins by asking for two entries. The first is the name of the service you want to create, while the second is the physical location and name of the DLL that you want to access. I normally use the name of my class as the service name, but you can use any name you like. After you make the required entries, click OK. If you’ve entered the right DLL location, you’ll see a Select the services you would like to expose dialog box similar to the one shown in Figure 4.4.

Figure 4.4 Use the Select Interface dialog to choose which component interface you want to access from the application.
Select an interface and individual methods within the interface, then click Next. WSDL Generator will ask you to provide the URL of the listener and select a listener type as shown in Figure 4.5. (You’ll see an example of a listener later in the chapter.) In most cases, you’ll want to keep the XSD Schema Namespace set for the current year, unless you’re working with an older application that relies on an older specification.

Figure 4.5 Part of the WSDL generation process is to select a listener type and XSD Schema Namespace year.
Make the listener entries, and then click Next. WSDL Generator will ask for the physical location and name of the WSDL file as shown in Figure 4.6. This also determines the location and name of the WSML file. Both of these files should appear in the same directory as the ASP listener (when implemented) that you create for ease of use. Notice that you can also choose between a UTF-8 and UTF-16 file format for your WSDL and WSML files. The UTF-8 file is easier to read with a text editor, but the UTF-16 file provides better language support.
Figure 4.6 Determine the location of the WSDL and WSML files for your service.
Select a location and file format, then click Next. You’ll see a success message stating that WSDL Generator created the file. The only time you’ll see a failure message is if you don’t have sufficient rights to create a file in the target directory, or some other external problem occurs. Note that although the message only says that you’ve created a WSDL file, you’ve also created a WSML file.
Microsoft SOAP Messaging Object Generator

The Microsoft SOAP Messaging Object Generator is a Visual Basic add-in that you can use to create simple SOAP applications with little effort. This add-in should appear in the Add-In Manager dialog box after you install the Microsoft SOAP Toolkit on your development workstation (use the Add-Ins | Add-In Manager command to display the dialog box). Figure 4.7 shows an example of what your Add-In Manager dialog box will look like. All you need to do is check Loaded/Unloaded and click OK to begin using this add-in.

Figure 4.7 A view of the Microsoft SOAP Messaging Object Generator add-in within the Add-In Manager dialog box.
Remember that the Microsoft SOAP Messaging Object Generator works with an ActiveX DLL project. You’re creating a component that hides all of the interface details of a SOAP application. The developer then uses the component as a resource in the client application, working with SOAP in the same way as with any other object. This two-part approach makes development easier and the application more modular.

Let’s look at what you’ll need to do to create the component part of an SMO application. The following steps will help you create a typical component.

1. Load the Microsoft SOAP Messaging Object Generator add-in if necessary. Start the add-in using the Add-Ins | Microsoft SOAP Messaging Object Generator command. You’ll see an SMO Generator—Introduction dialog box.

2. Click Next. You’ll see an SMO Generator—Specify Schema dialog box like the one shown in Figure 4.8.

Figure 4.8 You’ll need to provide schema information that the SMO Generator can use to create the component framework.
3. Click Browse. You’ll see an Open dialog box. The SMO Generator can theoretically work with XML, XDR, XSD, BIZ, and WSDL files. We’ve already seen how to create WSDL files using the WSDL Generator in the “WSDL Generator” section of the chapter. You could also create one of the supported file types manually or use the tools supplied with other Microsoft products such as Microsoft BizTalk Server (see Appendix B for details). We’ll select an XDR file for this example.
For the purposes of this example I’ll use a custom XDR file that I created using XML Spy. You’ll find this XDR file in the \Chapter 04\Sample Component directory on the Que Web site for this book. You can find it at www.quepublishing.com.
4. Select a file and click Open. You’ll return to the SMO Generator—Specify Schema dialog box.
5. Click Next. You’ll see an SMO Generator—Specify Namespace dialog box like the one shown in Figure 4.9. This is where you’ll type the Uniform Resource Name (URN) of your organization. (See the “A Question of Globally Unique Identifiers” sidebar in Chapter 5 for more details about URNs.) Notice that the URN shown in Figure 4.9 consists of three parts: the moniker urn, the URN actually issued to your company by a third party, and the name of the component class. The only entry you normally need to change is the company URN.

Figure 4.9 Entering the right URI for your component is essential, so make sure you enter the right company URN.
6. Type a URI for your component, then click Next. You’ll see an SMO Generator—Select Elements dialog box like the one shown in Figure 4.10. Note that the name and number of elements you see will vary with the construction of the source data file. For example, you may see different results if you use a WSDL file than if you use an XDR file. In most cases, you’ll find that a WSDL file provides less ambiguous results than an XDR file will.

7. Select one or more of the top-level elements, then click Next. You’ll see an SMO Generator—Prepare Project dialog box. Note that the wizard will automatically check the Delete “Class1” Class Module option because you don’t need this class in most cases. Uncheck this option if you have already added global code to the Class1 module.

8. Click Next. You’ll see an SMO Generator—Change Class Names dialog box as shown in Figure 4.11. This dialog box allows you to change the names of the classes that the Microsoft SOAP Messaging Object Generator creates. Depending on how you generate the source file and the names chosen by any tools you use, you may need to change a class name to make it more readable. In this case, I changed the class name to AddItAccess because that name is a lot more specific than Root, the name found in the XDR file I used.
Figure 4.10 After you’ve entered all of the required preliminary information, you’ll see a list of available elements.
Figure 4.11 Use an easy to decipher class name, even if this means changing the default entry shown in this dialog box.
9. Click Next. You’ll see an SMO Generator—Finish dialog box.

10. Click Finish. The Microsoft SOAP Messaging Object Generator will automatically create all of the required classes for you.
At this point, the wizard has completed its work. Depending on how well the source file (component description) defined your component, you should be able to compile this DLL without any change and begin using it for your client. All you’ll need to do is include this component and begin programming your SOAP application using standard dot syntax.
An Overview of the Application

The example application is going to be a simple server-side component that adds two numbers together. The client will provide two integers as input and expect an integer as output. In other words, this application won’t do anything amazing. My intent in creating it is to show you some of the basics of creating a SOAP application using the simplest means possible. Rather than wade through reams of source code, you’ll see SOAP in its simplest form. The purpose, of course, is to make it easier to understand the harder applications that appear later in the book.

The following sections will help you understand various SOAP application elements. The first section shows how differences in SOAP application requirements will affect your programming methodology. Next, I’ll show you how SOAP applications exchange data in general. Finally, we’ll look at the data flow for this application in particular.
How SOAP Applications Differ

It’s important to use the right techniques when creating a SOAP application because using the wrong technique can result in many wasted hours reinventing the wheel. In general, SOAP applications differ in several ways. Here’s a list of what I consider the most important criteria.

■ Application Complexity: As application complexity increases, so do development problems. You may need to rely on something other than simple parameter passing methods when working with complex applications.

■ SOAP Toolkit Limitations: We talked about the kinds of applications you can create with the Microsoft SOAP Toolkit in the “Three Types of Microsoft SOAP Toolkit Application” section of the chapter. The Microsoft SOAP Toolkit won’t provide every solution you need. You’ll definitely want to look at all of the available solutions when designing a SOAP application.

■ Platform: If your SOAP application has to operate on more than one platform, you’ll need to find one or more toolkits that work with each other as part of your development solution. For example, the Microsoft SOAP Toolkit doesn’t work with an Apache server, so you’ll need to find an Apache-specific solution if you need to work with both Microsoft and Apache servers. Likewise, PDAs all require individual toolkits; there aren’t any solutions that work with all PDAs.

■ Access: The type of resources your application requires affects the way you develop it. Some developers feel that SOAP is ill equipped to handle any form of database application. Others feel that SOAP isn’t secure enough for sensitive data. It’s important to consider the resources your application requires before you decide on a design method.

■ Client Needs: The client doesn’t affect your SOAP application as much as it affects binary solutions, but problems remain. You need to consider how the client interacts with the server because some clients require more input than others. For example, the infamous data-type problem affects clients as well as servers.

■ Partnerships and Customers: One of the reasons that people are excited about SOAP is that it allows easy access to company resources by partners and customers. Of course, this access also presents challenges to the developer. You need to consider additional security and other problems that will affect your application design.

■ User Needs: Applications today have to support all kinds of users. Some users require a PDA, while others can get by with a laptop computer. You’ll also need to consider the special needs of users with special challenges, such as the visually impaired.
This list contains the issues you’ll need to consider most often. You may also have to consider special needs for your company. For example, some applications will need real-time (or nearly real-time) access to data. This requirement affects the way you build an application and could affect your choice of tools.

The main point of this section is that SOAP actually increases the diversity of applications you can create. No, you won’t create a desktop application with SOAP, but SOAP applications could contain some of the same functionality that used to appear with desktop applications alone. For example, administrators used to worry only about LANs. Now they need to consider organizations with a worldwide scope. The desktop application they used to manage the network no longer fits the bill; they require a distributed application of the same type.

SOAP applications vary in so many ways that careful design now has to include walkthroughs and careful research. Product decisions are more critical than ever before because a poor choice will cost more. Security hazards and user requirements are both greater than at any earlier point. SOAP is a tool with a lot of potential to solve the significant problems you face today.
Basic Application Design and Data Flow

Some people will have the wrong view of SOAP from the outset because prior development experiences and the features of their toolkit of choice influence them. Some developers view SOAP today as a two-way, real-time, synchronous protocol. This is the view that many SOAP toolkits espouse today because SOAP is still in its infancy and there are few specifications to guide developers.

The SOAP specification doesn’t contain any such limitations. A SOAP communication is a one-way transmission: a client sends a message to a server. There’s no need for the server to respond to the request, nor does the server have to handle the message on any deadline. This simple view of SOAP leaves a lot of room for innovation and some developers are already stepping up to the challenge. Future SOAP toolkits will very likely allow asynchronous one-way communication. These toolkits may offer queued message transmission (Microsoft is already working on this feature) and some toolkits may offer transactional support. If you’re looking at SOAP in a specific way today, be prepared to see things differently tomorrow.
How does the reality of SOAP affect application design and data flow? You normally need to design a SOAP application with the one-way transmission in mind. A request/response scenario, therefore, actually requires two one-way transmissions. The first sends a request to the server, while the second sends a response to the client. Both server and client require a serializer to create the message, a listener to receive the message, and a parser to read the message.

SOAP is a stateless protocol. This means that you can’t depend on the component having any given state when you send a call. You must create a new connection each time you want to communicate with the server because the Web server will consider each call a new session. Making multiple calls to obtain one piece of data is therefore a difficult and error-prone task that you should avoid. It pays to make SOAP applications that use self-contained calls.
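The serializer and parser roles described above can be sketched in a few lines. This is an illustration only, written in Python so that it runs without a toolkit; the urn:example-org:AddIt namespace and the function names are assumptions made for the sketch, not part of the Microsoft SOAP Toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
APP_NS = "urn:example-org:AddIt"  # hypothetical namespace for the illustration


def serialize_call(method, params):
    """Serializer role: wrap a method call in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_call(message):
    """Parser role: recover the method name and parameters from a message."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]
    method = call.tag.split("}", 1)[1]
    params = {child.tag: child.text for child in call}
    return method, params


# Everything the receiver needs travels inside the one message, which is
# what makes the call self-contained.
request = serialize_call("DoAdd", {"Add1": 10, "Add2": 5})
print(parse_call(request))
```

A response would travel the same way in the opposite direction, with the server acting as serializer and the client as parser.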
Understanding the Simple Application in This Chapter

The application in this chapter is a perfect example of the self-contained call. We’re performing a simple task: adding two numbers. While this isn’t representative of the real-world applications that you’ll create, it does show how to use simple parameter passing within an application. Simple parameter passing represents the easiest SOAP application you can create. Here is the list of steps the application follows when processing the client request.

1. The user enters application data and clicks Do It.

2. The client application creates a session with the server by creating a SOAP client and specifying the name of a WSDL file to receive the input.

3. The XML Parser on the server opens the WSDL file and processes it. This creates a connection between the WSDL file and the ASP used to receive input.

4. The ASP receives input from the client and creates the SOAP server required to process it. The SOAP server uses the schema found in the WSDL and WSML files to create the required component and make the client request.

5. The SOAP server creates a response based on the output from the component.

6. The XML Parser creates a SOAP response message, which it returns to the ASP.

7. The Web server responds to the new ASP information by sending a response back to the client application.

8. The client application parses the information and displays the result for the user.
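The steps above can be imitated in a single process to make the data flow easier to follow. The sketch below is an approximation in Python, not the Toolkit implementation: the listener() function stands in for the ASP page and SOAP server, the OPERATIONS table stands in for the WSDL and WSML mapping, and the DoAddResponse and Result element names are assumptions chosen for the illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def do_add(add1, add2):
    """Stand-in for the server-side component."""
    return add1 + add2


# Stand-in for the WSDL/WSML mapping from operation names to component methods.
OPERATIONS = {"DoAdd": do_add}


def listener(request_xml):
    """Stand-in for the ASP listener and SOAP server (steps 3 through 7)."""
    body = ET.fromstring(request_xml).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    args = [int(child.text) for child in call]
    result = OPERATIONS[call.tag](*args)

    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    response_body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    response = ET.SubElement(response_body, call.tag + "Response")
    ET.SubElement(response, "Result").text = str(result)
    return ET.tostring(envelope, encoding="unicode")


def client(add1, add2):
    """Steps 1, 2, and 8: build the request, send it, parse the reply."""
    request_xml = (
        f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}"><SOAP-ENV:Body>'
        f"<DoAdd><Add1>{add1}</Add1><Add2>{add2}</Add2></DoAdd>"
        "</SOAP-ENV:Body></SOAP-ENV:Envelope>"
    )
    response_xml = listener(request_xml)  # a real client would use HTTP here
    body = ET.fromstring(response_xml).find(f"{{{SOAP_ENV}}}Body")
    return int(body[0].find("Result").text)


print(client(10, 5))
```

In the real application the call to listener() is an HTTP request to the Web server, but the request and response still pass through the same serialize, dispatch, and parse stages.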
Shortcuts for Creating SOAP Applications Quickly

This section contains tips that help you develop SOAP applications faster and with fewer bugs. The bugs portion of the picture is extremely important because it doesn’t pay to develop an application with lots of bugs. In today’s market, you need to work fast and still develop an application that works properly. You might find that not all of the tips mentioned here are applicable to your project. I’ve provided tips for many different working environments and you may work in only a few of them.
The most important tip that I can pass along is to work with a SOAP toolkit. Yes, you can write an application without relying on a SOAP toolkit, but it will take a lot longer and have more bugs in the end. A SOAP toolkit provides you with the tools that you need to create applications quickly, ensures that the syntax of your SOAP messages is correct (and follows the specifications), and provides you with some “boilerplate” code such as listeners. Every SOAP application requires a listener on the server to respond to client requests, so this is an important part of any application.

Use the default serializer whenever possible. SOAP relies on an XML formatted text message to transfer data from one point to another. This means that the data must follow the current SOAP and XML specifications to work. In most cases, the rigid structure of the XML message leaves little room for creativity when it comes to the message format (the content of the message is a different story). Nevertheless, some developers are already creating custom serializers that perform special tasks. Using a custom serializer opens your application to bugs and compatibility problems. The vendor creating the default serializer for the toolkit you use has already extensively tested it against the specification, so using it whenever possible eliminates one additional source of problems.

Always test your application on two machines, preferably separated by an Internet connection. Some vendors continue to suggest developing an application on a single machine to reduce the amount of remote debugging a developer needs to perform. A local implementation of a distributed application rarely behaves the way the deployed application will, so problems stay hidden until late in the project. Developing in a distributed environment from the outset brings these problems to light early in the development cycle, when they take less time to fix. In addition, working with application pieces on local machines in a team environment will almost certainly result in problems when you put the application together. Local application development may be easier, but it’s also one sure way to create problems in the end.

Most developers know that they need to design the data portion of an application first, which includes creating any required databases, indexes, and views. However, when using SOAP, you also need to design the message format of your application as part of the data design process. Make sure you create a design that will work with partner applications if your distributed application will affect more than one company. When a company already has a message format in place that doesn’t agree with your own design needs, you’ll have to rely on technologies such as eXtensible Stylesheet Language Transformations (XSLT). Using XSLT allows you to transition between message formats with a minimum of programming. Of course, you can always write an application to perform the transition as well, but this is a labor-intensive method with dubious benefits for the developer.
Like everything else dealing with SOAP, you’ll find that XSLT is in a state of chaos as standards committees hammer out the final details of a specification. You can find out more about the XSLT specification at http://www.w3.org/TR/xslt. Find out about the eXtensible Stylesheet Language (XSL) at http://www.w3.org/TR/xsl/. Some interesting related technologies include the XML Information Set (http://www.w3.org/TR/xml-infoset/) and XML Path Language (http://www.w3.org/TR/xpath).
We’ll discuss most of these technologies as the book progresses, but it pays to do some research up front. Fortunately, you don’t have to become a guru in any of these technologies to work with SOAP and use the sample applications in the book.
Use the wizards whenever possible. The WSDL Generator makes short work of creating a WSDL file for your component. In fact, I wouldn’t consider creating an application without this handy utility. Using the SMO Framework and the associated Microsoft SOAP Messaging Object Generator add-in for Visual Basic makes client application development easier. Of course, like any wizards, the ones you find in the Microsoft SOAP Toolkit tend to hide implementation details and are somewhat rigid when it comes to developing an application. If you really need to create a low-level API application to maintain full control over the environment, then be sure to allow a lot of additional development time.
Understanding Namespaces, the Short Version

Libraries, DLLs, and other forms of external code storage have always presented problems because finding the particular method or function that you need is time consuming. The method and function declarations appear to have no discernible order and you might find two related functions in different DLLs. In short, using libraries and DLLs in the past required the developer to memorize the location of certain functions. While many developers did end up creating their own “internal library” of functions and their locations, the process for creating this library is expensive in terms of time.

Namespaces provide a means of bringing order to the world of library and DLL method storage. Yes, the functions and methods of the past are still stored in libraries and DLLs, but now you can find them much more easily. In addition, you normally don’t need to memorize their location. You can find the methods and functions with relative ease using a browser or other tool that relies on the namespace hierarchy to do its job.

We’ve already seen how SOAP uses namespaces in the examples in Chapters 1 and 2. Applications such as Windows Explorer use the namespace concept to organize objects like drives and file folders. Namespace extensions provide methods of interacting with these objects. Future versions of Visual Studio will also rely on namespaces. The Common Language Runtime (CLR) that Microsoft plans to introduce with Visual Studio.NET relies on namespaces to organize methods. Figure 4.12 provides you with a glimpse of how CLR will eventually present functions for your use. In short, namespaces are a great organizational tool.

You can view all namespaces as a hierarchical database. A top-level container holds one or more object containers. Each object container can hold one or more methods and one or more additional object containers. The hierarchy continues until the entire content of a DLL appears somewhere within the hierarchy.
Look again at Figure 4.12 and you’ll see the hierarchical organization of objects and methods. The objects appear in the left pane, while the methods associated with the selected object appear on the right. Vendors always organize namespaces in this manner, but you won’t always see them displayed as shown in Figure 4.12. The tool you use for viewing a namespace determines the method of presentation.
You gain access to a particular method by presenting the path to its location in the namespace hierarchy. Look again at Figure 4.12 and you’ll see that you can access the CloseHandle() function by requesting it like this:

    Microsoft.Win32.Win32Native.CloseHandle()
Figure 4.12 CLR will rely on namespaces to organize methods for developers.
Some application environments use a dot syntax as shown above, while others use different characters, such as the colon (:) that SOAP relies on. In all cases, you need to provide the entire path to access the desired method. Most vendors provide some type of shortcut for accessing the path, such as the Imports statement used by Visual Basic.NET. Once you define a path to the desired method, all you need to do is reference it by name within your code, which greatly reduces the amount of typing required to create an application.

Let’s get back to SOAP namespaces. There are two sets of namespaces to consider. The first set appears within the SOAP message. You’ve already seen these namespaces in action in the examples in previous chapters. The SOAP toolkit vendor defines the second set of namespaces, which you’ll use within your application.

As of this writing, most SOAP toolkit vendors provide two namespaces as part of the toolkit. The first namespace is for the SOAP serializer (a component that organizes the data you want to output into the SOAP message format). The serializer namespace contains all the methods you’ll use to create requests and responses. For example, a startBody() method call creates the message body, while an endBody() method call ends it. The second namespace is for the SOAP fault mechanism. It contains methods for creating the four SOAP fault elements that we discussed in the “SOAP Fault Messages” section of Chapter 2.

You’ll rely on two sources for the SOAP message namespaces. The first source is defined by standards organizations to provide standard SOAP message features. For example, the http://schemas.xmlsoap.org/soap/envelope/ Web site defines standard features for the SOAP envelope. Figure 4.13 shows an example of the fault namespace as defined by the standards organization at the time of writing.
Figure 4.13 The SOAP specification defines sources for standard SOAP namespace definitions.
You’ll also define namespaces of your own that define access to components on the server. The SOAP toolkit usually provides a utility to create the files required to define the namespaces you’ll access from within your application. We looked at the process for using a custom namespace as part of the application code within Chapters 1 and 2. Essentially, you define a local variable that references the Web site with the namespace definition. The local variable then serves as the path to the component methods that you want to access. This is obviously a quick tour of namespaces for SOAP, but we’ll review this topic often as the book progresses. You can’t create a SOAP application without knowing something about namespaces. I feel this is a topic best learned by doing, so I’ll expand your knowledge of namespaces in the example applications that follow.
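A short sketch may help make the idea of a namespace as a path more concrete. It uses Python’s standard XML library rather than a SOAP toolkit, and the urn:example-org:AddIt namespace, the m prefix, and the variable names are all assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

message = """\
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m="urn:example-org:AddIt">
  <SOAP-ENV:Body>
    <m:DoAdd><Add1>10</Add1><Add2>5</Add2></m:DoAdd>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

root = ET.fromstring(message)

# The parser expands each prefix into the full path, "{namespace}name".
print(root.tag)

# A prefix-to-URI mapping plays the role of the local variable described
# above: it is shorthand for the path, and the prefix itself is arbitrary.
paths = {
    "env": "http://schemas.xmlsoap.org/soap/envelope/",
    "app": "urn:example-org:AddIt",
}
call = root.find("env:Body/app:DoAdd", paths)
print(call.tag)
```

Notice that the lookup uses the prefixes env and app even though the message uses SOAP-ENV and m. Only the namespace URI behind the prefix has to match.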
Creating the Server Side Code

You’ll normally design and test the server side code first. The reason is simple: you need a functional server before you can do anything with the client. There are many ways to design the server-side code, but I normally begin with the component. That way I can verify operation locally and ensure the rest of my server code won’t suffer from a faulty component during the debugging process.

Creating a listener comes next. We’ll use an ASP file in this case. The ASP file should contain code that passes input from the client to the SOAP server and then passes the output from the SOAP server back to the client. Most listeners will also contain some form of error handling code. This code creates a SOAP fault message and passes it back to the client.

The final piece is creating the required WSDL and WSML files. We’ve already performed this step in the “WSDL Generator” section of the chapter, so I won’t repeat that step here. At this point, you’ll want to verify the server setup by running a small test discussed in the “Designing a Listener” section.
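To give a rough idea of what a fault message contains, the following sketch builds a SOAP 1.1 fault with its four elements (faultcode, faultstring, faultactor, and detail). It is written in Python for illustration and does not use the Toolkit’s fault methods; the build_fault() name, the message element inside detail, and the listener URL are assumptions.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_fault(code, text, actor, detail_text):
    """Return a SOAP 1.1 envelope whose body carries a single Fault."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    fault = ET.SubElement(body, f"{{{SOAP_ENV}}}Fault")
    ET.SubElement(fault, "faultcode").text = code
    ET.SubElement(fault, "faultstring").text = text
    ET.SubElement(fault, "faultactor").text = actor
    detail = ET.SubElement(fault, "detail")
    ET.SubElement(detail, "message").text = detail_text  # hypothetical entry
    return ET.tostring(envelope, encoding="unicode")


print(build_fault(
    "SOAP-ENV:Client",
    "Add1 is not an integer",
    "http://winserver/soapexamples/AddIt.asp",  # hypothetical listener URL
    "The value 'ten' could not be converted",
))
```

The client can then read faultstring to tell the user what went wrong, which is what the error handler in the client code for this chapter relies on.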
Designing the Component

The server-side component for this example could be any component that you created in the past. Nothing that you’ve done in the past will change. The server-side component is totally unaware of any interaction with SOAP because of the way that SOAP calls on the component for services. Here’s the code for our component.

    Public Function DoAdd(Add1 As Integer, Add2 As Integer) As Integer
        'Calculate the result.
        DoAdd = Add1 + Add2
    End Function
All you need to do is create a new ActiveX DLL, add a function to it, and add this code. You’ll need to check the Unattended Execution and Upgrade ActiveX Controls options in the AddIt—Project Properties dialog box. Make sure you set this up as a single-threaded DLL. Compile the code as a DLL. You can test the component locally if you want, using a small test application. When you finish, move the component to the server and register it using RegSvr32. Obviously, most components are more complex than the one shown here, but the process you’ll follow is essentially the same in all cases.
Designing a Listener

Microsoft doesn’t provide any automated tools for creating a listener. However, you can use the listeners provided in the Microsoft SOAP Toolkit as examples of what you need to do. In fact, these listeners will work with simple applications with just a few modifications on your part. A simplified form of the ASP file we’ll use follows. This version doesn’t include error handling. (I’ll talk about error handling in the “Handling SOAP Errors” section of the chapter.)
As you can see, the listener is relatively simple. It begins by creating a partial response message that states the type of content the server will return. The ASP then creates a SOAP application (if necessary), maps a path to the WSDL and WSML files, and finally initializes the SOAP server. The last step is to invoke the SOAP server with the request information and pass the response back to the client. SOAP listeners are normally more complicated
than this, but not by much. All that the SOAP listener does is pass information to the appropriate place.

Once you have the listener, WSDL, WSML, and component in place, you can run a quick test to ensure that you’ve set your server up correctly. Simply log onto the Web server and select the ASP page. You should receive an error message stating that you haven’t provided input variables, like the one shown in Figure 4.14. Notice that the error reflects an inability to load the XML parser (MSXML).

Figure 4.14 Perform a quick test on the server installation to ensure you have the application set up correctly.
Creating the Client Code

You’ll begin creating the client by starting a new Standard EXE project in Visual Basic. Figure 4.15 shows the form that you’ll create. Table 4.2 provides settings for this form. As you can see, it uses a simple form that contains just enough entries to show the input and output of the component, and allows the user to initiate the SOAP call.
Table 4.2 AddIt Client Application Form Settings

Control      Property      Setting
frmMain      BorderStyle   Fixed Single
             Caption       AddIt Example
             Height        2000
             Width         5200
txtAdd1      TabIndex      1
             Text          10
txtAdd2      TabIndex      2
             Text          5
txtResult    TabIndex      5
cmdDoIt      Caption       Do It
             TabIndex      3
cmdQuit      Caption       Quit
             TabIndex      4
Figure 4.15 This is the simple form we’ll use for the client-side application.
The client-side code needs to know about the SOAP connection when employing the technique used in this application. However, the client-side programming requirements are light. The simple client code used for this example follows.

    Option Explicit

    Private Sub cmdDoIt_Click()
        'Create the SOAP client.
        Dim Client As SoapClient

        'Set up an error handler.
        On Error GoTo ErrorHandler

        'Create the connection.
        Set Client = New SoapClient
        Client.mssoapinit _
            "http://winserver/soapexamples/AddIt.wsdl", _
            "AddIt", _
            "AddItPortType"

        'Perform the addition.
        txtResult.Text = CStr(Client.DoAdd(CInt(txtAdd1.Text), _
                                           CInt(txtAdd2.Text)))
        Exit Sub

    'Display a message when an error occurs.
    ErrorHandler:
        MsgBox Client.faultstring, vbExclamation
    End Sub

    Private Sub cmdQuit_Click()
        'Exit the application
        End
    End Sub
The example begins by creating a SOAP client, which creates the connection between the client and the server. You call the mssoapinit() method and supply a minimum of one argument, the location of the WSDL file. Our example also supplies a service name and the name of the service port. The toolkit uses the first service and port listed in the WSDL file if you don’t supply these values. The mssoapinit() method also accepts a fourth argument, the location of the WSML file. Don’t supply this value unless you want to provide custom schema handling.
The arguments for the mssoapinit() method call get passed to the server precisely the way you type them into the application. If a server provides case-sensitive handling of data from the client, then the call will fail if you don’t use the right case. It always pays to use the same case for these arguments as you find in the WSML file.
So, where do you get the service name and service port name? The best place to look is in the WSML file. Here is the WSML file for this example.
Notice the <service> and <port> tags on the fourth and sixth lines. These two tags define the names that you’ll provide within the client application. This is another reason to generate the WSDL and WSML files before you begin writing the client application. The WSML file will always contain the information you need and in the proper case.

Now that we have a connection to the server, let’s get back to the remaining lines in the client application. The next step is to call the DoAdd() method within the component. This line may look a little confusing because of all the conversions taking place. Essentially, we’re making a straight call to the component, just as you would for any other application. The remaining lines of code in the listing provide simplistic error handling and a means to exit the application.
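If you prefer to read the names out of the file rather than copy them by eye, a few lines of script will do it. The sketch below uses Python, and the WSML-style fragment in it is a hypothetical stand-in with assumed element names; the file that the WSDL Generator writes for your component will differ in its details.

```python
import xml.etree.ElementTree as ET

# Hypothetical WSML-style fragment used only for this illustration.
wsml = """\
<servicemapping name="AddIt">
  <service name="AddIt">
    <port name="AddItPortType"/>
  </service>
</servicemapping>"""

root = ET.fromstring(wsml)
service = root.find("service")
port = service.find("port")

# Copy these strings exactly, including case, into the client call.
service_name = service.get("name")
port_name = port.get("name")
print(service_name, port_name)

# A case-sensitive comparison shows why "addit" is not a safe substitute.
print("addit" == service_name)
```

Reading the names from the file avoids the case mismatches that cause the connection call to fail on a case-sensitive server.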
Testing the SOAP Application
Developing SOAP applications means learning new techniques. I’m sure that most of you didn’t have to worry about using a serializer with your LAN applications, but it’s an essential part of a SOAP application. Fortunately, once you know what a SOAP message should look like, creating one isn’t as hard as it might first appear. The programming is relatively simple, in fact, because the vendors creating the SOAP specification went to extremes to make simple code a reality.

Testing, however, will never get easier. I remember thinking that testing applications on the desktop was difficult. Moving to the LAN brought an unprecedented level of complexity, but it was still manageable. Testing a SOAP application, unfortunately, means checking things that you never thought to test before. No longer can you assume anything about the application environment. A SOAP application that works on your local machine is no longer enough. Even testing across the LAN in your company is no longer enough because you never know where your application will end up in the future.

The following sections try to take some of the pain out of SOAP application testing. We’ll begin by looking at how you can test the simple application we’ve created in this chapter. The hands-on time you spend with simple applications at the outset will greatly reduce the learning curve for complex applications in the future. The next section provides some tips about SOAP application testing. You’ll want to read this section carefully because distributed applications present problems that could become terrifying in scope. Finally, we’ll look at an interesting Web site you can visit to test your server for SOAP compatibility. Testing your server against a standard is important because it ensures that your application will work with the broadest range of external servers possible.
Checking the Example Application for Errors
At this point, you can run the client application. If you enter two numbers, click Do It, and get a result back from the server, everything is working fine. This is one case when you’ll want to test the application both inside the IDE and out. The IDE can actually interfere with the operation of your application in some cases, especially when you’re in a debugging mode.
One of the more common problems new developers run into when trying to get an example to run on two machines is not registering the components on the server. Installing the Microsoft SOAP Toolkit automatically registers the required components on the local workstation, but does nothing for the server. You must register all of the DLLs found in the Binaries directory using RegSvr32.
If you do run into problems, the first thing you’ll want to do is check the client to ensure it’s transferring data to the server in the correct format. I’ll show you how to do this using the tcpTrace tool in Appendix C. However, there are other resources at your disposal, such as Network Monitor. Make sure you check your application from multiple machines and offsite if possible.

Sometimes you can’t check your system from an offsite location with ease. That’s where the SOAP Message Builder (http://www.soapclient.com/soapmsg.html) comes into play. Figure 4.16 shows what this site looks like.

Figure 4.16 The SOAP Message Builder allows you to simulate outside access of your Web site.
As you can see, you enter your server address, SOAP action, and SOAP message as a minimum. The SOAP message should contain the same information that you expect your client to send. You can obtain the message from the tcpTrace diagnostic screen. It’s just a matter of cut and paste. Click Execute when you’re ready to test your application. The Web site also provides a sample that you can generate by clicking Generate at the bottom of the page (not shown in Figure 4.16). The sample gives you a better idea of how this Web site works.

If external access is an important element of your application, you should get the same result from your server using an external source as you would from an internal source. Likewise, you can use this site to test security. Given a valid message and other parameters, this site can answer the question of whether someone outside your company can access your application with ease.

The Generic SOAP Client (http://www.soapclient.com/soaptest.html) is another valuable testing tool. In this case, you make a call to your server just as you did with the client application earlier, except that you’re checking external access. Figure 4.17 shows what this tool looks like.

Figure 4.17 The SOAP Client Builder allows you to simulate outside access of your Web site.
As you can see, all you need to enter is the URL for the WSDL file on your server and the intended response type. Click Retrieve when you’re ready to perform the test. The testing tool takes care of the rest for you. Like the SOAP Message Builder, this tool provides a simulation of external access and can point out potential security problems with your setup. Note that this site also provides a sample application that you can use for learning purposes.
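The three inputs that the SOAP Message Builder asks for (server address, SOAP action, and SOAP message) map directly onto an HTTP POST, so you can also assemble the same request yourself. The following Python sketch builds the request without sending it. The listener URL, the SOAP action string, the namespace, and the parameter names are all assumptions made for illustration; substitute the values from your own server and the message you captured with tcpTrace.

```python
import urllib.request

# Hypothetical values for illustration only.
LISTENER_URL = "http://winserver/soapexamples/AddIt.asp"
SOAP_ACTION = "urn:example:AddIt#DoAdd"
ENVELOPE = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <m:DoAdd xmlns:m="urn:example:AddIt">
      <Value1>1</Value1>
      <Value2>2</Value2>
    </m:DoAdd>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def build_soap_request(url, soap_action, envelope):
    """Build (but do not send) an HTTP POST that carries a SOAP 1.1 message."""
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 over HTTP carries the action as a quoted string.
            "SOAPAction": '"%s"' % soap_action,
        },
        method="POST",
    )


request = build_soap_request(LISTENER_URL, SOAP_ACTION, ENVELOPE)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen() would actually send it; keeping the build step separate lets you inspect the headers and body first.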
SOAP Testing Tips
Distributed application development represents a new obstacle to testing. Consider for a moment that the application isn’t on the desktop or even the LAN anymore. The application communicates across a network of unknown quality and there’s a potential for problems that you never considered in the past. In addition to the problems of communication, a distributed application today may execute on more than one platform. In other words, the operating system is no longer a constant, so you can’t depend on any particular set of features on the other end of the data connection. Finally, you can’t even depend on a given level of user training. Users from all over the world could end up using your application. In short, the development environment for all developers today is more complex than the commercial (shrink-wrap) development environment of yesterday.

Since today’s developer can’t depend on a specific environment, testing takes on a nightmarish appearance at times. Newsgroups now contain impassioned pleas from developers who have no clue at all on how to debug a remote application. In many cases, the testing procedure is arcane and relies on outdated techniques, such as sending test messages back and forth to see if you can get anything to work. With this in mind, let’s look at some simple tips for reducing test complexity.

■ Always test your application on a LAN using two machines at the outset. Yes, I’ve said this before, but it’s an important issue. Creating an application on the local machine and testing it there results in an application that runs fine locally. Microsoft and other vendors espouse this dangerous technique because it makes it easier for developers to feel they’ve accomplished something more quickly. In the end, however, a developer only hurts the final product by testing locally.

■ Once you get an application running on a LAN, test it across an Internet connection, even if you have to set up a test site from your home to the office. Your LAN is a safe environment where messages are unlikely to get lost and the connection is sturdy. The Internet provides none of these things. If you deploy a distributed application on the Internet with only LAN testing under your belt, be prepared to get lots of support calls from irate users.

■ Test your application against standards validation sites. We’ll talk about a site to test your server in the “Testing Your Server” section. In past chapters, I’ve mentioned two important Microsoft SOAP test Web sites: http://www.soaptoolkit.com/soapvalidator/ and http://www.soaptoolkit.com/soapvalidator/listener.asp. These sites are important tools in a world of SOAP development where few tools exist.

■ Check your application against more than one toolkit. The compatibility problems between the Microsoft and the Apache toolkits demonstrate the need to perform this simple check. These toolkits won’t talk to each other at present because of the way in which they implement the SOAP specification. Sure, Apache intends to fix the problem, but you’ll likely run into other problems later.

■ Run your application on more than one Web server if possible. SOAP applications depend on the features provided by Web servers to a certain extent. Make sure you check your application against more than one Web server. In addition, make sure you let users know which Web servers you checked so they know in advance that the application may not work with other Web servers.

■ Run your application on more than one platform if possible. Distributed applications typically need to communicate with multiple platforms. You may use all Windows 2000 Server setups on Intel machines in your company, but a partner may rely on UNIX servers that run on some other processor. It’s important to consider every potential obstacle and this is likely to be a big problem.
You’ll also find that testing your application once is probably not enough. Validating your application every time a new specification comes out may become a painful reality in the world of distributed applications. Unlike the application you run on a LAN, SOAP applications won’t run in a vacuum. Business partners may decide to update their SOAP implementations outside the normal processes for your company, making it possible that an application that worked fine yesterday will no longer work today.

By now you’re thinking that SOAP is supposed to make things simpler, but this sure seems more complicated than anything you did in the past. The truth is that SOAP has made the development process a lot easier, but it looks more complex because of the increase in environment complexity. Imagine trying to make distributed applications of the kind that we’re talking about here work with DCOM or CORBA. You’d find yourself pulling out hair by the handfuls and never getting anything done. SOAP makes distributed applications possible using a standardized specification, something that all developers need in today’s connected environment.
You’ll discover that an important part of the troubleshooting process is to know precisely what information the client and server exchange. The only way to do this is to read this information after it leaves all of the processing levels of both machines. Since this is such an important requirement, you’ll want to add an easy-to-use tracing tool to your arsenal. One such tool is tcpTrace. We’ll discuss tcpTrace in detail in Appendix C.
Testing Your Server
Testing your server is every bit as important as testing the rest of the application. You need to know that the server is compliant with the current SOAP standard. The problem is that you can’t really test your server without knowing everything there is to know about the specification. Even if you did know the specification inside out, you’d have to write a test suite and then test your server, which is a lot of work for a test you’ll perform once.

Fortunately, there’s another solution. You can use the SOAP 1.1 Validator site at http://validator.soapware.org/. Figure 4.18 shows the opening screen for this Web site. As you can see, the site generates scripts that it uses to test your server for SOAP compatibility.
Figure 4.18 The SOAP 1.1 Validator Web site can test your server for SOAP 1.1 compatibility.
Using the SOAP 1.1 Validator is easy. As shown in Figure 4.19, enter the URL for your site, port number for HTTP access, and path to the test area. Notice that this site includes a default entry. I’ve used this entry to learn more about SOAP by seeing how the default site reacts to the various scripts. You can display the various SOAP messages used in the test. This allows you to see how the request/response mechanism should work and can help you locate problems with your own site.

Figure 4.19 Using the SOAP 1.1 Validator is easy: just enter the required site information.
Once you enter the required information, click Submit. Be patient as the site tests your Web site. I often wondered if something had gone wrong, only to see the test results appear a short time later. During my tests, validation took up to five minutes. The test time varies as a function of your connection speed and the load on the remote server. Figure 4.20 shows the type of results that you should get from the test.

Figure 4.20 The SOAP 1.1 Validator provides a simple pass/fail report of your server’s compatibility.
The Web site will display several columns of information. The second column contains a description of the test. Click on the link to gain insights on test purpose and server response requirements. The third column displays OK if your server passed the test or a failure message if it didn’t. The fourth and fifth columns are the most interesting. They help you to see the test site request and your server’s response. You can use this information for both training and diagnostic purposes. It could also come in handy for testing the capabilities of various toolkits. Figure 4.21 shows a typical response message.

Figure 4.21 A typical server response to a SOAP 1.1 Validator request.
As you can see, this is a textbook response similar in format to those we discussed in Chapters 1 and 2. This particular response is plain; it doesn’t include many of the attributes we discussed earlier, but it does meet the minimum SOAP requirements. Obviously, you’ll want to check your server’s response against the level of SOAP message response that your application actually requires. For example, a response should include embedded type information if you plan to work with complex data types.
Handling SOAP Errors
SOAP errors come in a variety of shapes and sizes. You’ll find that debugging a SOAP application is every bit as complex as any other distributed application, especially if you don’t have the right tools. For example, finding a problem between two platforms becomes quite complex when you need to use two different toolkits. Compatibility problems are at the forefront of SOAP issues today.

Many of the most frustrating SOAP problems are also the easiest to detect and fix. Developers often forget to check the simple problems before launching into a long and frustrating debugging session. For example, I found that capitalization is a major problem. Not every platform is case sensitive, but enough are that you’ll want to check for this problem before you begin troubleshooting something more complicated. If I suspect that any part of my application will have case sensitivity problems, I’ll include a routine that checks for this problem before creating the SOAP server or instantiating other objects.

Security is also a major showstopper with SOAP. Many developers forget to include the anonymous user, IUSR_, in the list of entities that can access the object. Of course, you won’t have this problem if you disallow anonymous access. The point is that you must include code that will report security problems accurately so the network administrator doesn’t spend hours looking for a problem that he or she could fix in just seconds.

Some SOAP errors are toolkit specific. The WSDL files used in the example program contain location-sensitive information. If you move the WSDL file on the server, you’ll find that your application no longer works. However, the error you get back indicates that there’s an internal server error. Finding this problem can become frustrating because the WSDL files contain a lot of information for even a small component.
It normally pays to generate a new WSDL file, rather than troubleshoot the existing one, if you suspect that someone moved the file to another location on the server.

One of the odd errors that you may run into is a compatibility problem between versions of the XML parser. For example, quite a few people have complained of applications that stopped working suddenly when they installed Visual Studio.NET on their workstations. It turns out that Visual Studio.NET upgrades the XML parser installed on most machines. One of the errors that this new version looks for is the position of the XML tag (the XML declaration) within a document. Previous versions would allow the tag to appear in the middle of the document. The new version requires that the XML tag appear on the first line. Of course, this is just one of many errors that the new version will check. If you’re having problems getting a SOAP message transferred across the wire after a tool update, it pays to check for these XML formatting errors. In many cases, you’ll get an error message that has nothing to do with the actual problem, making it difficult to find and fix.
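You can see the effect of a misplaced XML tag with any strict parser. The following Python sketch uses Python’s standard parser as a stand-in; it isn’t the Microsoft parser discussed above, and the one-element documents are made up for illustration. The only difference between the two inputs is a blank line in front of the declaration.

```python
import xml.etree.ElementTree as ET

# Two made-up documents: identical except for the leading newline.
GOOD = b'<?xml version="1.0"?><Envelope/>'
BAD = b'\n<?xml version="1.0"?><Envelope/>'  # declaration pushed off the first line


def parses(document):
    """Return True if the document parses, False if the parser rejects it."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(parses(GOOD))
print(parses(BAD))
```

A check like this, run against the message you captured on the wire, is a quick way to rule out formatting problems before you look for something more complicated.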
Speaking of XML errors, make sure you check for attribute errors as well. For example, many people have run afoul of the encoding attribute in the XML tag. Using UTF-8 encoding with a machine that doesn’t provide the proper support results in a completely unreadable message. Likewise, you need UTF-8 support when working with certain languages; the old International Organization for Standardization (ISO) standbys (such as ISO-8859-1) won’t work any longer for applications with a world view.

Reading the SOAP Specification
You’ll undoubtedly spend some time reading the SOAP specification during the development process and again during the testing process. This book attempts to bring some clarity to some of the muddled sections of the specification. It’s not a replacement for the specification, but it will help you understand the intent of some areas.

One of the more confusing areas of the specification is the use of the SOAP Fault element. We talked about the composition of this element in the “SOAP Fault Messages” section of Chapter 2 and will talk about it again here. This particular element has caused more than a little discussion on the various newsgroups that I frequent because faults can appear in so many areas. Reporting an error properly requires knowledge of how the SOAP Fault element works.

The SOAP Fault element can only appear once within the body of a SOAP message. You can’t create more than one fault entry, and the fault entry normally appears by itself without any other SOAP message data. Some newer SOAP developers create one SOAP Fault element for each error, causing the client to reject the error message and leaving the user in the dark as to the cause of a problem. Figure 2.3 shows a perfect example of a SOAP Fault element.

The specification also places limits on which sub-elements you can use to report certain types of messages. This is one place where even veteran developers make mistakes.
You always include a faultcode and faultstring element as part of the SOAP Fault element. However, developers only use the detail element when the error appears within the body of the message. If the error appears within the header of a message, you use only the faultcode and faultstring elements.

An even more important consideration about the detail element is that the server often uses it to report component errors. If you look at Example 10 in the SOAP specification, you’ll notice that the faultcode and faultstring elements report a general server error. The detail element reports the specific component failure that caused the server error. Since the detail element is used for a specific component in this case, you’ll notice that the namespace points to that component.

So, where do you place details about a SOAP header fault? They should appear in the SOAP header, if you want to include any at all. The SOAP specification doesn’t require SOAP header error details and any implementation is up to the sender of the fault information. Unfortunately, the lack of a specific description of header fault errors means that everyone is free to interpret them as they see fit. I think you’ll find that each toolkit uses a different method for reporting header errors. Hopefully, the SOAP committee will fix this oversight in future specifications.

The point of this exercise is clear. Make sure you look at the examples if you use the specification to figure out a design problem in your application. In some cases, the write-up isn’t enough to tell you precisely what you need to know to solve a problem. This isn’t an unexpected problem since the SOAP specification is new. Many of these areas left to the interpretation of the reader will disappear as the SOAP specification matures.
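The rules above (one Fault per body, faultcode and faultstring always present, detail only for body errors) are easy to check in code. The following Python sketch is one way to build and read a SOAP 1.1 fault under those rules. The helper names, the urn:example-component namespace, and the message text are all hypothetical; only the envelope namespace and the element names come from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_fault(faultcode, faultstring, detail_text=None):
    """Build an envelope whose body holds exactly one Fault element."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    fault = ET.SubElement(body, "{%s}Fault" % SOAP_ENV)
    ET.SubElement(fault, "faultcode").text = faultcode
    ET.SubElement(fault, "faultstring").text = faultstring
    if detail_text is not None:
        # detail is reserved for errors in the body; the child element uses a
        # hypothetical namespace that stands in for the failing component.
        detail = ET.SubElement(fault, "detail")
        ET.SubElement(detail, "{urn:example-component}message").text = detail_text
    return ET.tostring(envelope, encoding="unicode")


def read_fault(message):
    """Return the fault fields, rejecting messages with zero or several faults."""
    root = ET.fromstring(message)
    faults = root.findall("{%s}Body/{%s}Fault" % (SOAP_ENV, SOAP_ENV))
    if len(faults) != 1:
        raise ValueError("expected exactly one Fault element, found %d" % len(faults))
    fault = faults[0]
    return {
        "faultcode": fault.findtext("faultcode"),
        "faultstring": fault.findtext("faultstring"),
        "has_detail": fault.find("detail") is not None,
    }


body_error = build_fault("SOAP-ENV:Server", "Server Error", "Component failed")
header_error = build_fault("SOAP-ENV:MustUnderstand", "Header not understood")
print(read_fault(body_error))
print(read_fault(header_error))
```

The body error carries a detail element and the header error does not, which mirrors the distinction the specification draws.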
Performance Concerns for all Applications
Performance is troublesome when working with SOAP. As mentioned in Chapter 2, SOAP has some performance-inhibiting problems right now that I doubt a standards committee will fix anytime soon because fixing them would add to SOAP’s complexity. Advanced features such as connection and resource pooling are only available where the complexity of the underlying protocol allows for such features. This is counter to one of the design goals of SOAP, which is to reduce the complexity of the programming environment for those situations where a complex protocol isn’t required.

SOAP applications respond to the same types of code optimization that all applications do. For example, you want to perform as much work as you can outside a loop because performing work within a loop is expensive. Some optimizing compilers will even look for this common programming flaw and fix it within the executable for you. However, the fact remains that you must perform some types of optimization yourself because an optimizing compiler just can’t perform the required analysis.

In addition to the common code fixes, SOAP also responds to some fixes that reflect the method used to create a SOAP message, decode it, or act upon its contents. That’s where this section of the chapter comes into play. The following sections provide some handy tips on how you can make your SOAP applications run faster. Not every tip will work in every situation. You’ll want to test the technique with your application, see how it reacts, and then use it if applicable. I’ve used simple demonstration examples for the sake of clarity. Obviously, your code will contain complex interactions that you’ll want to check thoroughly. Getting all of the fat you can out of your application not only makes the application run faster, but could make it more reliable as well.
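As a small illustration of moving work outside a loop, the following Python sketch builds a batch of message fragments two ways. The DoAdd wrapper, the m prefix, and the example namespace are made up for illustration. Both functions return the same text; the second simply computes the part that never changes once instead of on every pass.

```python
from xml.sax.saxutils import quoteattr


def fragments_in_loop(namespace, values):
    """Recompute the quoted namespace on every pass through the loop."""
    result = []
    for value in values:
        # quoteattr(namespace) gives the same answer each time around.
        result.append(
            "<m:DoAdd xmlns:m=%s><value>%d</value></m:DoAdd>"
            % (quoteattr(namespace), value)
        )
    return result


def fragments_hoisted(namespace, values):
    """Compute the unchanging prefix and suffix once, outside the loop."""
    prefix = "<m:DoAdd xmlns:m=%s><value>" % quoteattr(namespace)
    suffix = "</value></m:DoAdd>"
    return [prefix + str(value) + suffix for value in values]


values = [1, 2, 3]
print(fragments_in_loop("urn:example:AddIt", values)
      == fragments_hoisted("urn:example:AddIt", values))
```

Whether the difference matters depends on how expensive the repeated work is and how many messages you build, so measure before and after in your own application.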
Attributes Versus Elements
One of the concerns that many developers have is the issue of using attributes versus elements in their code. I’ve been using elements for all of the examples in the book because they’re easier to read. In fact, you should use the element approach whenever possible in your code for the same reason: to make the code easier to read. Here’s an example of a simple SOAP message using the attribute approach. Note that the MyObj:GetPerson tag attributes normally appear on one line, rather than multiple lines as shown.
If you check this example at the SOAP Message Validation site at http://www.soaptoolkit.com/soapvalidator/, you’ll find that it validates just fine. This model also works in real life. The advantage of using the attribute approach is performance. It takes less time to parse this kind of message than messages that rely on the element approach, which means you see a small performance boost on a per-client basis and a larger increase in overall server performance.

As previously mentioned, one downside of the attribute approach is readability. Some developers also feel that the attribute approach is less flexible. For example, you can’t provide type information when using the attribute approach. We discussed one problem with a lack of type attributes in Chapter 2: you may find that servers like the Apache server won’t parse the message properly.
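To make the two approaches concrete, the following Python sketch encodes the same data both ways and reads it back. Only the MyObj:GetPerson tag name comes from the discussion above; the namespace, the FirstName and LastName fields, and their values are hypothetical, and the fragments omit the surrounding SOAP envelope for brevity.

```python
import xml.etree.ElementTree as ET

NS = "urn:example-MyObj"  # hypothetical namespace

# Attribute approach: every value rides on the method tag itself.
ATTRIBUTE_STYLE = (
    '<MyObj:GetPerson xmlns:MyObj="%s" FirstName="John" LastName="Doe"/>' % NS
)

# Element approach: every value gets its own child element.
ELEMENT_STYLE = (
    '<MyObj:GetPerson xmlns:MyObj="%s">'
    "<FirstName>John</FirstName><LastName>Doe</LastName>"
    "</MyObj:GetPerson>" % NS
)


def read_attribute_style(text):
    """Collect the values from the attributes of the method tag."""
    return dict(ET.fromstring(text).attrib)


def read_element_style(text):
    """Collect the values from the child elements of the method tag."""
    return {child.tag: child.text for child in ET.fromstring(text)}


print(read_attribute_style(ATTRIBUTE_STYLE))
print(read_element_style(ELEMENT_STYLE))
print(len(ATTRIBUTE_STYLE), len(ELEMENT_STYLE))
```

Both readers recover the same two values, and the attribute form is shorter. Notice, though, that a child element has room for its own attributes (such as type information), while an attribute value does not, which is the flexibility trade-off described above.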
Code Optimization
Performance is a function of the code you use to create the application. Sure, you can rely on an optimizing compiler to do some of the work for you. For example, if you need short-circuiting of the comparisons in an if statement, then an optimizing compiler is the tool of choice. The developer can’t really optimize the code, in this case, because there isn’t any way to know if short-circuiting will work until runtime. The compiler needs to insert the short-circuiting logic.

However, there are times when the coding technique you use determines how well your code will run. Consider the situation where you can either make a decision using control structures or use smart coding techniques that allow the application to make the decision automatically. Developers use three common techniques to handle a situation where you have to take a different action based on the input type of a variable.

■ Create a series of If...Then statements that test for the type of the variable and take appropriate action.

■ Use a Select Case statement to perform the comparison and perform the action within a case.

■ Create a class containing the code and use overloading to perform the comparison automatically.
All of these techniques work, but they’re based in part on procedural thinking. Sometimes, all you really need to do is perform a type cast or use an existing method to perform the work you need to do. Here’s a very simple example of using smart coding to reduce the amount of time required to make a decision.

Private Sub Command1_Click()
    Dim iTest As Integer
    Dim sTest As String
    Dim bTest As Boolean

    CheckType (iTest)
    CheckType (sTest)
    CheckType (bTest)
End Sub

Public Sub CheckType(vType As Variant)
    MsgBox "The input type is: " + TypeName(vType)
End Sub
I could have used the If...Then statement, a Select Case statement, or a class to perform this task. However, in this case, using something simpler is easier to code, easier to understand, and works faster as well. This is an example of a programmer-specific optimization that you can’t perform using other techniques; an optimizing compiler won’t help you in this situation.

The reason these techniques are so important in SOAP is that SOAP already has several performance hits against it. The method you use to serialize data, encode and decode messages, and interact with the data on the server all affect the performance of your application to a great degree. Small changes to coding style can make big performance differences when it comes to working with SOAP.
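The same contrast shows up in other languages. The following Python sketch is an analogy to the Visual Basic listing, not a translation of it: the first function walks a chain of tests, while the second asks the runtime for the type name directly, much as TypeName() does. The function names are made up for illustration.

```python
def type_name_chain(value):
    """Decide the type with a chain of explicit tests."""
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    elif isinstance(value, int):
        return "int"
    elif isinstance(value, str):
        return "str"
    return "unknown"


def type_name_direct(value):
    """Ask the runtime for the type name instead of testing for each case."""
    return type(value).__name__


for sample in (0, "", False):
    print("The input type is: " + type_name_direct(sample))
```

The direct version needs no changes when a new type comes along, while the chain needs another branch for each one.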
Project
Now that you have a better idea of how a simple SOAP application works, it’s time to put some of that knowledge into practice. We began this chapter with a simple component that could add two numbers together. However, it might be useful to try a component that can do more. The following steps provide an overview of an add-on to this project. Going through these extra steps helps reinforce what you’ve learned (or at least points out places where you need additional study).
Depending on how you set up your server, you may need to shut down the Web service and restart it to ensure the component you worked with earlier is unloaded. In some rare cases, developers have reported the need to restart their machine in order to clear memory completely. You’ll also want to clear the temporary Internet files from your test browser so that you don’t receive false test indications.
1. Open the AddIt project.
2. Add methods for multiplying, dividing, and subtracting two numbers based on the code we used for adding two numbers.
3. Copy the new version of the component to your test server and register the component again.
4. Update your WSDL files using one of the WSDL generator programs we talked about earlier.
5. Modify the client application to allow for multiplication, division, and subtraction.
6. Recompile the client and test the new capabilities of your application.

Notice that we didn’t modify the ASP file. Normally you won’t need to perform this step when upgrading the capabilities of an existing application since the ASP file only acts as a messenger. The WSDL and WSML files always require an update whenever you make any kind of change to your server.
CHAPTER 5
Migrating an Application from DCOM to SOAP

In this chapter
Introduction
SOAP Application Conversion Prerequisites
Updating a Simple Utility Program
Updating a Data Viewer
Updating a Complete Database Application
Modified Application Concerns
Troubleshooting
Introduction
For years, Microsoft has tried to convince you that COM, DCOM, and COM+ were the grail of application development and that they could answer your every need. The reality for most of us is that these technologies work great on a LAN, somewhat on a WAN, and barely or not at all on the Internet. So, at this point, you may consider doing something rash when the boss asks you to scale your application to work in a distributed environment. You contemplate the huge investment you made in Microsoft technology and realize that it won’t work in the new application environment. Don’t worry, all is not lost; you can still recover. Converting all or part of your application to SOAP may be the solution you were looking for all along.

I’ll warn you in advance that this chapter isn’t designed to provide you with step-by-step instructions for converting a specific application to SOAP. There isn’t any way that I can guess much about your application and the intricacies of converting it to a new programming technology. However, this chapter will show you the basics of converting some application examples that I feel represent basic application types, and you can use the knowledge you gain to convert your particular application.

The first section of the chapter, “SOAP Application Conversion Prerequisites,” helps you to understand what you need to consider before you begin converting an application. We’ll talk about such important issues as determining which modules to convert. You probably won’t want to convert the entire application, except in the most severe cases. Since you’ll be using more than one protocol in the new application, you’ll need to consider methods for avoiding protocol-related problems in the modified applications as well. There are also integrating and binding issues to consider. This section talks about all of these issues and more.
The second (“Updating a Simple Utility Program”), third (“Updating a Data Viewer”), and fourth (“Updating a Complete Database Application”) sections of the chapter show three conversion scenarios of increasing complexity. I chose these three scenarios because they best represent the types of applications you’ll likely need to convert. I do build on each application as I convert it, so you’ll probably want to read all three sections. The fifth section of the chapter, “Modified Application Concerns,” tells you about the three major issues in a modified application. First, you need to consider reliability. The modified application is using new code, so you need to test it as if it were a new application. Second, since SOAP doesn’t provide the robust security of binary protocols like DCOM, you’ll need to consider security issues for the new application. You may find that security needs require you to split some modules into two parts: those that work with outside companies and those that you can only use internally. Finally, the new application won’t run as quickly as before. You need to decide if performance has become an issue and what you intend to do about the performance problems. In some cases, you may decide to forgo using SOAP because performance is too poor. If that occurs, then you need to work with partners to come up with alternatives to the problem. The final section, “Troubleshooting,” will show you how to troubleshoot converted applications based on what you’ve learned throughout the rest of the chapter. This is an especially
important section because diagnosing problems in a hybrid application (one that uses both SOAP and a binary protocol) can be especially difficult.
Many developers who normally work with DCOM will need to make a quick transition to SOAP as companies make the move to distributed application development. Not every developer will need a novice-level book or a college course to start the learning process. In some cases, an online tutorial will provide all you need. Many Web sites provide these courses. For example, you’ll find a course, “XML messaging with SOAP” at developerWorks (http://www-106.ibm.com/developerworks/). Just click the Tutorial link and follow the instructions to select tutorial by name. This site does require you to register before you can use it, so it’s likely that you’ll receive additional e-mail from them.
SOAP Application Conversion Prerequisites Converting any application from one technology to another is never a small undertaking. Even simple application conversions require some level of advanced planning. Yet, some companies will try to perform the conversion ad hoc and usually end up botching the entire process. This section will help you avoid the fate of those who add SOAP to their existing applications without forethought or planning. Each of the following topics will take you one step closer to getting that application moved from the local server to a distributed environment where everyone benefits. (Of course, we’ll take the hands-on approach to the problem later in the chapter.)
Lest you think that application conversion is only for those who have nothing to risk, some major companies are already planning on conversions as I write this. One of the more interesting companies is eBay. You can read a detailed report about its SOAP application upgrade effort at http://scriptingnews.userland.com/backissues/2001/03/12#microsoftAndEbay. This article specifically states that eBay will use SOAP in its upgrade efforts. Imagine being able to add eBay engine support to your next application. Sometimes the future of development boggles the mind.
An Overview of the Conversion Process It always helps to have a map when you begin a journey, and anyone who has converted an application before knows that the journey is both long and filled with trouble. Heading off into the great unknown without a map may have the appeal of becoming like the brave adventurers of the past, but remember that many of those adventurers set out, never to return home again. Unless you want to bear the scars of failed implementations, be sure you have a complete map to guide you through the process. So, where do you get a map before you begin your application conversion adventure? A SOAP application conversion map consists of several elements. The list of the elements that I feel are most crucial for success follows:
■ A block diagram of your application.
■ A list of application conversion goals and requirements.
■ Schemas of any databases.
■ Descriptions of the interfaces between various application components.
■ Documentation of any security requirements for the system.
■ A list of end-user needs.
It’s usually a good idea to begin with the block diagram of your application. Outline areas of the application that users outside the company must access and those that only users within the company will need. For example, the code used to display data will probably require both internal and external access. On the other hand, you may not need to provide any administrative features outside the company, so these modules require internal access only.
In some cases, you’ll find a component that provides both external and internal-only uses. For example, some developers place the code required to display records in the same component as the code required to create new records. You can normally gain a slight performance benefit, increase security, and create a modular application by splitting components along external and internal-only lines.
At this point, update your application’s block diagram. Add a list of interfaces between all components. You’ll need this information later as you work with SOAP support files, such as the Web Services Description Language (WSDL) files required for server-side components. Some developers will use this opportunity to create WSDL or other IDL files as part of preparing for the application upgrade.
➔ For details on using this particular technique, see the “Using Schemas as Design Tools” sidebar in Chapter 6.
It’s time to look at the interfaces—specifically at the data types used for method calls. You may find that some data types don’t translate well to the world of SOAP programming. The more complex the data type, the harder the conversion. In addition, some data types are operating platform specific. You wouldn’t want to try to pass a window handle to an Apache server that wouldn’t have any idea of what to do with it. While the SOAP specification provides well-documented methods to handle variables such as strings and integers, each vendor normally handles complex data types such as objects in unique ways. Simplifying method call data types whenever possible now will make the application conversion process easier and could net some performance gains. Of course, you have to weigh the advantages of simplification against the time required to perform the conversion. In addition, you only need to perform this task on data used externally—internal only components won’t require any change unless you plan to use SOAP for the entire application.
The SOAP specification does include provisions for arrays, enumerations, and structures. You can use these three complex data types to your advantage during the conversion process. In many cases, you can group a set of parameters in a structure in such a way that the structure elements describe the complex data type. This allows the remote machine to reconstruct the complex data type in a form that it understands. In short, SOAP provides ways around complex data types that allow you to maintain cross-platform compatibility.
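As a rough illustration of grouping parameters into a structure, the following Python sketch builds a SOAP 1.1 request in which a platform-specific value (a window handle, say) is replaced by a structure of simple fields that any server can rebuild. The method name ShowWindow, the window structure, its field names, and the urn:example:windowinfo namespace are all hypothetical; only the envelope and encoding namespace URIs come from SOAP 1.1. Treat it as a sketch of the idea rather than the output of any particular toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"  # SOAP 1.1 encoding style
APP_NS = "urn:example:windowinfo"                       # hypothetical application namespace


def build_struct_request(method, struct_name, fields):
    """Return a SOAP 1.1 request whose single parameter is a structure."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("m", APP_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    # Each structure member becomes a named child element (an accessor).
    struct = ET.SubElement(call, struct_name)
    for name, value in fields.items():
        ET.SubElement(struct, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Describe the window with plain values instead of passing a handle.
request = build_struct_request(
    "ShowWindow",
    "window",
    {"title": "Orders", "left": 10, "top": 20, "width": 640, "height": 480},
)
print(request)
```

The receiving side only needs the member names and simple values to reconstruct its own native representation, which is what keeps the call portable across platforms.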
Depending on how you plan to use the application after conversion, you’ll probably want to perform all of the required component upgrades at this point. For example, you’ll want to separate external-use code into individual components by function and type. As you’ll learn later, the smaller you can make the individual modules, the easier it is to apply individual security and make changes later. Compile the application and test it on a LAN using DCOM. Putting the application back together and testing the new configuration now means that you’ll have one set of debugging tasks out of the way. Using this technique also ensures you can check the new application architecture using a development technique that you’re completely comfortable with, rather than debugging the components using SOAP.
Once you’re certain that the new application design works, that the application’s block diagram is up-to-date, that all of the interfaces are set up properly, and that you’ve clearly delineated external operations, it’s time to start the SOAP conversion. This may mean creating additional components. Some businesses use one component for internal use and another for external use. For example, the external component might use a different processing technique or include additional security. However, the majority of the conversion effort will include creating listeners for the server and new clients or client-side components.
If you’re using the technique we talked about in Chapter 4, “Using SOAP to Create a Simple Application,” to create your application, you’ll want to get all of your listeners set up and ready for use. Many developers choose ASP for the listener files because they’re easy to create and maintain. However, if you need the highest possible application speed and lowest resource use, then going the ISAPI route is probably best.
The technique in Chapter 4 is great if you plan to write an application from scratch, but it isn’t necessarily the best choice for a conversion. I find that the easiest method for converting DCOM applications is to rely on the Microsoft SOAP Messaging Object (SMO) because it allows you to simulate the existing environment within the client. The client and server component code remain about the same using this method; the major change occurs with the listener and SMO code you need to write. Whichever route you choose, be sure you create your listeners, then create the WSDL files you’ll need to use with the application.
➔ For details, see the “Microsoft SOAP Messaging Object Generator” section of Chapter 4.
One of the final tasks is creating or modifying the client. If you go the SMO route, you might need to change the component instantiation method, but that’s about it. When using the high-level or low-level API routes, you’ll need to write a new client and add existing business logic to it. You can use any supported method for client access. Many developers will use the client-side application approach we used in Chapter 4. However, you can also use Web pages and rely on a scripting language to provide the required SOAP calls. The technique you
choose depends on client needs, user requirements, and the number of client objects at your disposal. The SOAP toolkit you choose makes a great deal of difference when it comes to client-side access. You’ll learn more about this as the book progresses, especially in Chapter 10 when we begin to work with smaller devices such as Personal Digital Assistants (PDAs).
A Question of Globally Unique Identifiers
Anyone who has worked with any form of COM understands the concept of the globally unique identifier (GUID). COM uses the GUID to keep track of various incarnations of components. Two components with the same name won’t interfere with each other because both components have a GUID that COM uses to differentiate them. Windows supports the GUID mechanism because it guarantees unique references.
SOAP also works with components and needs some way to identify them. However, there’s a difference between SOAP and COM when it comes to using GUIDs. SOAP relies on a uniform resource identifier (URI) to locate components. The URI points to the resource, allowing the application to find it as needed. The problem is that URIs aren’t necessarily unique, which means you could run into component-naming problems.
The general specification for URIs, RFC2396 (http://www.faqs.org/rfcs/rfc2396.html), states there are actually two types of URIs. The type that most people are familiar with is the uniform resource locator (URL). URLs allow you to find a Web site, but could also be used to find resources on a local area network (LAN). The second type of URI is the uniform resource name (URN). Of the two types of URI, only the URN is a managed resource. Being a managed resource means that a URN is guaranteed to be unique. Companies that want to use a URN must apply for a Namespace Identifier (NID) from an authority such as the Internet Assigned Numbers Authority (IANA). The NID appears as part of every resource reference that the company creates.
You can read more about the format of URNs at http://www.faqs.org/rfcs/rfc2141.html. RFC2611 (http://www.faqs.org/rfcs/rfc2611.html) explains how URNs guarantee unique references.
Does the non-unique nature of URLs mean that you should stop using them and switch to URNs? Developers will continue to use URLs because they’re more flexible than URNs and you don’t have to go through a lot of extra paperwork to use one. However, developers should try to create unique references by using URLs they actually own.
Several groups are working on solutions to the problem of unique identifiers. One such group is Resource Directory Description Language (RDDL). You can read more about this group at http://www.openhealth.org/RDDL/. The RDDL specification shows how an organization could use a URL that it owns to point to an XHTML document that contains a list of resources the company wants to make accessible. The document contains a description of the resource in human-readable form and embeds the required machine information as part of the description.
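To make the URL versus URN distinction concrete, here is a small Python sketch that classifies an identifier and pulls out the part that is supposed to make it unique: the NID for a URN, or the host name for a URL. The function name classify_uri is hypothetical, and the pattern only loosely follows the urn:NID:NSS layout from RFC2141 (the check on the namespace-specific string is deliberately simplified), so treat it as an illustration and not a validator.

```python
import re
from urllib.parse import urlsplit

# "urn:" + NID (1 to 32 letters, digits, or hyphens, not starting with a hyphen)
# + ":" + a non-empty namespace-specific string. Simplified from RFC 2141.
_URN = re.compile(r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(\S+)$", re.IGNORECASE)


def classify_uri(uri):
    """Return (kind, owner), where owner is the NID or the host name."""
    match = _URN.match(uri)
    if match:
        return ("URN", match.group(1).lower())
    parts = urlsplit(uri)
    if parts.scheme and parts.netloc:
        return ("URL", parts.netloc.lower())
    return ("unknown", None)


print(classify_uri("urn:isbn:0789725665"))
print(classify_uri("http://www.example.com/soap/orders"))
```

For a URL, uniqueness depends on the caller actually controlling the host name, which is why the text recommends using only URLs you own.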
Deciding Which Application Modules to Change
Let’s look more closely at the problem of deciding which application modules to change. The first factor to consider is which application modules to change based on external versus internal-only use of application features. The second factor to consider is that you might need to change some modules to accommodate special needs such as additional security. Of course, those are decisions that you’re required to make, but understanding that you need to make them doesn’t really define the problem fully. It’s important to know precisely which modules to change in the application and why you would need to change them before you begin the conversion process. After all, you don’t want to make additional work for yourself. Modifying a module that doesn’t require it can open security holes or reduce performance.
The following sections will help you understand the process of choosing the application modules to change during conversion. Most companies find that this is the hardest part of the process because it’s also the time where you have to develop a conversion plan. A bad decision here will incur significant costs later in the development process.
Sometimes it’s difficult to keep up with all of the acronyms appearing on the scene today, especially with something as fast-paced as XML. Acronym Finder (http://www.acronymfinder.com/) does a good job of helping you decipher many computer-related acronyms. It also pays to check out the XML Technology Protocol Reference at http://www.xml.com/pub/a/2000/11/01/protocols/quickref.html. Another place to find out what all of these acronyms mean is the WebServices Resource Center at http://soap-wrc.com/webservices/.
A Simple Scenario Let’s look at a hypothetical situation where conversion of an existing application makes sense. You manage a performing arts center. Business is OK, but it’s not great. The board of directors and your investors would like to see higher attendance levels. Advertising might help, but that still doesn’t make it easy for someone to get tickets, especially if they plan to visit your city on vacation. In addition, advertising across the entire country is expensive and there’s no good way to target your audience. Exposing the list of performances and the mechanism for making reservations will help the situation. Travel agencies could include a link to the services you provide to all of their agents with little programming. A travel agent could look at the list of performances and make a reservation for someone going on vacation with little effort. There are two advantages in using this technique. First, every time you make things easier for people, sales will almost certainly increase. Second, getting information to the people who can use it best will also result in increased sales at a low advertising cost. Of course, you wouldn’t want to expose anything other than the list of performances and the ability to make reservations. You might not want to provide the capability to cancel a reservation without a call, for example. That’s why it’s important to decide which parts of your applications to expose using SOAP. You only want to expose the functionality that other people can use, not something that could potentially harm your business. Even if a third party might need to cancel a reservation, forcing them to do so via a telephone call could reduce the potential for cracker activity on your system. In addition, using this method would tend to discourage cancellations.
Differences in Viewpoint Another part of the selection process is choosing modules for the right reason. Distributed application development requires a change in perspective. Squinting can help, but understanding how distributed applications differ from the desktop or client/server variety you’re used to working with is better.
First, you need to consider that you’re no longer relying on a two-tier or three-tier programming model. The Internet isn’t a peer-to-peer programming environment either because an application must store data in a centralized location. Developers refer to the Internet as an n-tier model because clients do some of the processing, while servers handle other parts. A server might need to call upon other servers to handle a request. SOAP is all about exposing services to the Internet community in whole or part. You’ll receive requests for service from a variety of sources, making it difficult to create a picture of activity outside your organization. It’s impossible to make assumptions in this environment.
This brings us to the second change in perspective. Huge, monolithic components won’t do the job in the Internet environment. SOAP and other Internet technologies work better when a component has a single well-defined purpose. For example, some companies will place all of their common dialogs in a single component file when working with desktop applications. You would need to use individual, single-dialog files on the Internet. If you can create a component that performs one task well, it becomes easier to create distributed applications. Unlike a desktop application, a distributed application should only load the functionality required to perform a task.
A third perspective-changing issue is that SOAP is stateless. You need to create complete packages. Making multiple calls as some DCOM applications do won’t work in the SOAP environment. Saving state between calls is hard to implement, and doing so will lead to significant problems.
A fourth problem is trust. Other people will rely on your application if you expose it as a Web service. This means that the components you provide can’t change. Any change in interface will break other people’s code. The application also has to provide 24/7 accessibility.
Depending on how many people use your application, you’ll likely find that people are hitting your Web site at all hours of the day and night.
Dividing the Application
When you look at your application, some external use needs become obvious early. For example, if you’re converting an order entry system to allow customers to check on the status of their order, then you’ll definitely want to convert the module that allows someone to display an order. However, you don’t want the customer to have the ability to change the order in any way, so you wouldn’t make the module that allows order entry accessible externally.
The order entry scenario does raise an interesting question. Although you don’t want the customer to change an order using the same methods as employees, you may want to give the customer the choice of canceling orders. This is one place where creating a smaller component that performs one task well comes into play. A client/server developer would place the ability to cancel orders within a large component that contained other features, such as the ability to create new orders. Distributed application development would require you to create several small components to do the work.
This order cancellation component will react differently depending on how a client accesses it. Instead of allowing direct access to the main database when working with an external client, you’ll want to create a customer cancellation database that you can check for cancellations later. This allows you to validate the cancellation with the customer before making it permanent in the main database. Just remember the principle of creating small components that perform a single task exceptionally well, and you’ll find that distributed application development is easier.
Eventually, you’ll end up with a list of coding projects based on the decisions you’ve made about internal and external component access. You can easily divide these projects into four areas:
■ Unmodified components that you won’t change because they work fine as is and no one will access them externally. Be certain you don’t have to duplicate the functionality of these older monolithic components anywhere else in your application. If you do, consider splitting the component into functional parts and reworking it.
■ Modularized components that contain few changes in business or procedural logic. For the most part, you’re splitting a large component into its constituent parts so that you can keep functions required by external processes separate from those that only internal processes will use.
■ Specialized components that react differently based on client. These components require more work to modify. The differences in reaction support security, privacy, or other system requirements.
■ New components that support either DCOM or SOAP services in a unique manner. You’ll find that the vast majority of new components support SOAP functionality. In many cases, these new components provide client-side processing support. This is by far the hardest component programming for the application since the component will need to perform SOAP-specific tasks. Low-level API components are much harder to develop than those that rely on SMO or the high-level API.
After you have a project list put together and have the tasks categorized by difficulty, you’ll want to start assigning them to members of your development team. The fact that each component and the interfaces it supports are well defined should allow team members to work in tandem. Of course, you’ll want to perform several levels of integration testing as team members complete their assigned tasks. Creating a critical time line that shows which components to develop first will make it easier to perform integration testing as needed.
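The order cancellation component described earlier in this section is a good example of a specialized component that reacts differently based on the client. The following Python sketch shows one possible shape for it: internal callers change the main order table directly, while external callers only add an entry to a staging list that someone validates later. The class name OrderStore, its fields, and the "pending" and "cancelled" status strings are all hypothetical, and in-memory containers stand in for the two databases.

```python
from dataclasses import dataclass, field


@dataclass
class OrderStore:
    # Stand-in for the main order database: order id -> status.
    orders: dict = field(default_factory=dict)
    # Stand-in for the separate customer cancellation database.
    pending_cancellations: list = field(default_factory=list)

    def cancel(self, order_id, external):
        """Cancel directly for internal callers; stage the request otherwise."""
        if order_id not in self.orders:
            raise KeyError(order_id)
        if external:
            self.pending_cancellations.append(order_id)
            return "pending"
        self.orders[order_id] = "cancelled"
        return "cancelled"

    def confirm(self, order_id):
        """Make a staged cancellation permanent after validating it."""
        self.pending_cancellations.remove(order_id)
        self.orders[order_id] = "cancelled"


store = OrderStore(orders={"A100": "open", "A101": "open"})
print(store.cancel("A100", external=True))   # request arriving through SOAP
print(store.cancel("A101", external=False))  # request from an internal client
```

Keeping the staging step in its own small component means the external path never touches the main database, which fits the single-task principle the section argues for.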
Avoiding Protocol-Related Problems in Modified Applications
The application you created using DCOM relies on the features provided by that particular protocol. Remember that DCOM provides strong security, the ability to use any data type within the application, and tight integration with the Windows operating system. What this means to you as a developer is that you might run into situations where the assumptions made in the original application conflict with the realities of working with SOAP. Unlike DCOM, SOAP doesn’t provide complete access to every data type, comes up short in the security department, and has little connection with the underlying operating system. SOAP also works in a distributed application environment, which means your code might appear in public view. In short, converting an application is likely to cause some problems on both the client and server end.
One of the biggest problems that you’ll run across when working with distributed applications in an environment that isn’t entirely under your control is the use of other languages. The problem is that your application might have to work with code from other countries. The SOAP specification is unclear on the ramifications of foreign-language use and the encoding requirements for your code. However, RFC2376 (http://www.normos.org/ietf/rfc/rfc2376.txt) and the newer RFC3023 (http://www.normos.org/ietf/rfc/rfc3023.txt) both provide specific encoding requirements. You need to decide what type of encoding to support at the outset of your project. Not including the proper encoding will result in an application that fails unexpectedly when used with other languages.
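One defensive habit, sketched below in Python, is to state the encoding in two places: in the XML declaration of the message and in the charset parameter of the Content-Type header. As I read RFC3023, a text/xml body sent without a charset parameter is treated as us-ascii, which is exactly the kind of silent default that breaks non-English data. The helper name encode_message and the customer element are hypothetical, and the sketch only prepares the bytes and headers without sending anything.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def encode_message(envelope, charset="utf-8"):
    """Serialize an envelope and return (headers, body bytes) that agree on the charset."""
    payload = ET.tostring(envelope, encoding=charset, xml_declaration=True)
    headers = {
        # Name the charset explicitly instead of relying on a default.
        "Content-Type": f'text/xml; charset="{charset}"',
        "Content-Length": str(len(payload)),
    }
    return headers, payload


envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
ET.SubElement(body, "customer").text = "José Müller"  # non-ASCII test data

headers, payload = encode_message(envelope)
print(headers["Content-Type"])
```

Whatever encoding you pick, the point is that the header, the XML declaration, and the actual bytes all agree.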
As of this writing, the Apache toolkit is the best one to use for international applications. It provides the broadest range of international support, including the use of proper encoding in all generated files. The support isn’t perfect, but it’s far better than any other SOAP toolkit available at this time. Since language support is such a hot issue, expect to see vendors update their SOAP toolkits in the near future. Microsoft, in particular, is anxious to update its language support. However, it’s important to verify how the vendors update that support and whether it follows the proper specifications. For example, following RFC3023 is better than following RFC2376 alone. You can still hand edit the output of any toolkit—that’s the beauty of using a text-based protocol.
Of course, language support is just one of many problems you’ll face. The following sections will help you decipher protocol-related problems in your converted application and will tell you how to avoid them. Of course, not every problem is the simple protocol error that it first appears to be. In many cases, you’ll find an underlying problem that wasn’t obvious at first or that only appears in certain circumstances because of branching in your component. In fact, odd errors that only occur when a given set of conditions is present are one of the major concerns for developers. The SOAP protocol is relatively easy to understand, but finding those odd errors that result from branching conditions can prove troublesome.
We’re examining a limited number of SOAP implementations in this book mainly because I would need several books to discuss them all. You might find that none of these SOAP implementations precisely meet your needs and that you would prefer to use something else. Finding these other SOAP implementations could prove difficult, even with the extensive search capabilities of the Internet. Fortunately, someone has already done part of the legwork for you by creating a list of current SOAP implementations at http://www.soapware.org/directory/4/implementations.
Server-Side Component Assumptions Not every protocol problem you run across will resolve itself to something within the SOAP message, runtime files, or associated components. Some developers have noted odd problems with their SOAP implementations. For example, one developer recently noted that when using the high-level API calculator example in the Microsoft SOAP Toolkit, he would get an
“Object Required” error message. Yet, if he commented the error parsing code out of the listener, the example would work. He then noted that he had made a few small changes to the example for the sake of experimentation. The problem turned out to be one where the error occurred within the component after it had already generated the correct response. Although the problem appeared to be within the SOAP portion of the application, it actually resided in the component in a location that wasn’t immediately apparent from the error message provided.
The problem with all of the connectivity that SOAP requires is that an error can hide. You might think the error is in one location when it actually appears in another. For this reason, you’ll want to know how to test your application using several techniques to ensure you can view problems from several perspectives. That’s one of the reasons why I’ve provided so many test Web sites in previous chapters.
The developer in this scenario could have avoided the problem by testing the component locally before deploying it on the server. He could have also tried using the component with other protocols to see if the problem actually occurred within the protocol or as part of the application component. Using these two tests would have avoided many hours of troubleshooting time because the developer would have seen the error locally. In short, the best way to avoid some protocol-related problems is to perform proper module-level testing.
Another problem is that SOAP is actually a message formatting specification and nothing more. Developers can assume many things about the SOAP specification, most of which aren’t part of the specification at all. For example, SOAP doesn’t provide transaction support and it won’t provide transaction support anytime in the future. If you need transaction support for your application, then you need to choose a transport protocol that supports transactions such as DCOM or CORBA.
The combination of protocols you choose is important because the server-side component will rely on these protocols to deliver all of the required information. Even if the application delivers a correct SOAP message, that message might not arrive at the component if the application doesn’t use all the required protocols to deliver essential information.
Error Handling Code
Moving your error handling code to a special module designed for the process may seem like overkill, but this is one area where a lot of developers run into trouble. DCOM provides one set of error handling resources; SOAP provides another. You can’t use a single set of error handling routines to handle both protocols so it’s best to place them in separate modules.
Let’s look at a typical example of the differences between SOAP and DCOM when it comes to error handling. Here is a short module designed to throw an error. Notice that I’m not using anything odd in the way of code.
Public Function ThrowError(myInput As String) As String
    'Generate some quick output.
    Dim myOutput As String
    myOutput = "There was an error! Bad string: " + myInput
    ThrowError = myOutput
    'Create an error message.
    Err.Clear
    Err.Raise 100, App.Title + ".GenError.ThrowError", "This is an error!"
End Function
You’ll find all of the sample code in this chapter in the Chapter 5 folder on the Que Web site for this book. You can find it at www.quepublishing.com. The examples contain more code than shown in the book—the listings show only the code required for explanation purposes.
You’ll notice that the example sets the function to a value, and then throws an error. Theoretically, the only thing that we’ll see as output is the error information. Here’s the code that I’m using on the client side. The error handling is a little more extensive than before because I want to show you how much of the error information actually arrives from the component. In other words, we need to answer the question whether SOAP hides anything.
Private Sub cmdThrowError_Click()
    'Create the SOAP client.
    Dim Client As SoapClient
    Dim Result As String
    Dim ErrorMessage As String
    'Set up an error handler.
    On Error GoTo ErrorHandler
    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://winserver/soapexamples/ShowError/ErrorSim.WSDL", _
        "ErrorSim", _
        "GenErrorSoapPort"
    'Initiate the error.
    Result = Client.ThrowError("Hello World")
    MsgBox Result, vbOK, "Error Message"
    Exit Sub
'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + Client.faultcode + vbCrLf + vbCrLf + _
        "Fault String: " + Client.faultstring + vbCrLf + vbCrLf + _
        "Fault Actor: " + Client.faultactor + vbCrLf + vbCrLf + _
        "Detail: " + Client.detail
    MsgBox ErrorMessage, vbExclamation
End Sub
This code is similar to the client code I created in Chapter 4. The main difference is that we’ll look at all the potential fault information. The application formats the output so that you can best see the available information. Figure 5.1 shows the output from this example.
Figure 5.1 The fault information output from this application isn’t complete.
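The four properties the client displays (faultcode, faultstring, faultactor, and detail) correspond to the child elements of the Fault element in a SOAP 1.1 response. If you want to inspect a captured response outside of Visual Basic, a short Python sketch along the following lines will pull the same fields out of the raw XML. The sample message is invented for illustration: the actor URL and the errorNumber detail entry are assumptions, not the actual output of the Microsoft SOAP Toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical fault message; real toolkit output will differ in its details.
SAMPLE_FAULT = """<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>This is an error!</faultstring>
      <faultactor>http://example.com/listener</faultactor>
      <detail><errorNumber>100</errorNumber></detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def parse_fault(xml_text):
    """Return the fault fields as a dict, or None if the body holds no Fault."""
    root = ET.fromstring(xml_text)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    detail = fault.find("detail")
    return {
        "faultcode": fault.findtext("faultcode", default=""),
        "faultstring": fault.findtext("faultstring", default=""),
        "faultactor": fault.findtext("faultactor", default=""),
        # Detail entries are application defined, so collect them generically.
        "detail": {child.tag: (child.text or "") for child in detail}
        if detail is not None
        else {},
    }


fault = parse_fault(SAMPLE_FAULT)
print(fault)
```

Reading the raw fault this way makes it easy to see which pieces of error information the component supplied and which never made it into the message.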
Look at the error generation code in the component listing and compare it to the output shown in Figure 5.1. You’ll notice that you didn’t receive the error number. If this had been a DCOM application, you would have seen an error number that might provide more information than the text. Figure 5.2 shows the data flow between client and server (I’m using the Microsoft SOAP Trace Utility in this case).
Figure 5.2 The fault information from a component is much more detailed than from a SOAP error.
As you can see, there’s a lot of information in the message, but the error number doesn’t match the one output by the application. The fact that SOAP outputs a lot more information for a component-related failure is great, but it doesn’t replace the information that’s missing. This is one of the reasons you’ll want to use multiple error handling routines. A SOAP application component will need to output more information as part of the description than your DCOM components will. Notice that the output does contain the name of the component as the application name. That’s the effect of the App.Title entry in the component code. You should always provide this information so that you can track down multilevel component failures with a little less trouble.
Of course, using multiple error handling routines raises another set of problems. You need to consider how the component will know which error handling routine to call if you create a mixed protocol application. The technique that I rely on is to add another argument to the component. This argument remains undefined when DCOM calls the component, but contains a simple text value when called by a SOAP listener. In this way, all the new code resides with the SOAP listener and the DCOM client application continues to operate essentially as it did in the past.
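The extra-argument technique can be sketched as follows. This is an illustration in Python rather than the component’s Visual Basic, and every name in it (report_error, handle_dcom_error, handle_soap_error, the caller argument, and the "SOAP" marker value) is hypothetical. The only things taken from the text are the idea of an optional argument and the need to put the component name and error number into the description for SOAP callers.

```python
def handle_dcom_error(number, description):
    """DCOM-style path: the caller receives the error number directly."""
    return {"number": number, "description": description}


def handle_soap_error(number, description, app_title):
    """SOAP-style path: the fault carries text only, so the error number
    and the component name are folded into the description."""
    text = f"{app_title} error {number}: {description}"
    return {"number": None, "description": text}


def report_error(number, description, caller=None, app_title="ErrorSim"):
    """Pick the error routine from the optional caller argument.

    caller stays None for DCOM clients; a SOAP listener passes a short
    text value such as "SOAP" (the value itself is an assumption).
    """
    if caller is None:
        return handle_dcom_error(number, description)
    return handle_soap_error(number, description, app_title)


dcom_result = report_error(1001, "Disk full")
soap_result = report_error(1001, "Disk full", caller="SOAP")
```

Because the argument defaults to an undefined value, existing DCOM callers need no change. Only the SOAP listener has to know that the argument exists.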
I’ve mentioned many times that the SOAP specification is changing as I write this and will probably continue to change for quite some time. At least some of these changes are the result of errors or omissions within the current specification. In other words, don’t always assume that error messages you see are problems in your code; they may be failings of the specification. You can find the current SOAP specification problems at http://www.w3.org/2000/xp/Group/xmlp-issues. You’ll also want to consider interoperability issues between toolkit implementations of SOAP. Find out more at http://www.soapware.org/interopathonPlan.
Some modules require modification simply to accommodate differences in protocol implementation. We’ll see later that you should create separate error handling modules for a mixed mode application. The requirements of SOAP error handling are quite different from those needed for DCOM. Since an application developer normally assumes a single error handling technique, you’ll find that you need to create these separate modules as new entities and find some consistent method for accessing them. In short, you need to consider protocol requirements when determining which modules to modify. It’s important to ask the question of how the new protocol, SOAP, acts differently than the old protocol.
Integrating New Modules with Existing Application Elements
The objective of any programming exercise (at least the eventual goal) is to create an end-to-end coherent application that performs all the tasks the users request of it. In the case of application modification, you begin with a fully functional application and end up with a fully functional application that has new features. In between these two stages you have various stages of dual application development. In other words, you’re actually creating two applications: one that works with SOAP and another that works with DCOM.
So, how do you finally create one application from two? Developers take many approaches to the problem. The method that seems to work best is to begin with the initial application and add task-related functionality to it. We’ll see how this works later in the three example applications. The concept is simple, but the implementation can be difficult.
Individual component testing can also help. Ensuring that each component is bug free before trying to integrate it with the larger application is important, especially with SOAP because there are many outside factors that can affect test results. To give you some idea of how bad this problem can be, one developer I talked with spent the better part of two days trying to diagnose a problem with an application. His thought was that since the component was so simple, the problem had to be with the network setup or some other factor. Even looking at the code didn’t seem to help, in part because he was so convinced the problem had to be elsewhere. A local test, however, showed that the component was faulty. If the developer had tested the component locally before integration, this problem wouldn’t have happened.
Testing the individual component with a reliable transport on a test LAN is also helpful. In this way, you can eliminate network-specific errors. A component that works locally may not work across the network. I normally test my component with DCOM first, because I’m familiar with the errors that DCOM will present. I then test locally using SOAP. Remember that you won’t fully test the component with DCOM because there are some SOAP-specific issues to address in the code.
This may seem like a lot of additional work. Developers under a deadline (and who isn’t today?) often skip these intermediate steps hoping the component will work the first time. More often than not, skipping the steps only wastes time because the developer ends up doing them again anyway.
Integrating new modules into an existing application can be tricky for other reasons. The most important consideration is production system downtime. You want to keep downtime to a minimum. All of these intermediate steps allow you to test the component fully outside the production environment. When you do add the new component to the system, the users should see added functionality and nothing else. Security is also an issue when adding new components to an existing application. You don’t want everyone to try the component at the same time.
The testing phase should include a timeframe in which experienced users test the functionality provided by the new component and provide comments on how well it integrates with the system as a whole. Often, a component works completely from a programmer’s perspective, but breaks the application from the user’s perspective because of differences in workflow. SOAP applications work in a distributed environment, so make sure you test the application in every environment and use every device. This is something that DCOM developers might not have worried about in the past because a DCOM application usually resides on a desktop client.
This final piece of the puzzle, user perspective, is actually the most difficult part of the integration. Some developers try to handle it by telling users that this is the new way the application works and they can’t do anything about it. One developer went so far as to restrict a user’s ability to handle commands at the keyboard, rather than use a mouse, just to deliver an application on time. The company didn’t accept the application because of user complaints about the change in workflow.
Once the experienced test group completes their testing of the new functionality, open it up to the company as a whole, or at least to those users who actually need to use the new application features. This last phase of testing should go smoothly if you’ve performed all the required intermediate steps. However, you should still solicit responses from the users, especially if your application will run on new types of clients such as PDAs. (We’ll look at a PDA example in Chapter 10, “Working with PDAs.”) In some cases, you’ll find that you need to make changes to the interface to accommodate these new device types, which are devices that you haven’t had to worry about when working with DCOM.
Implementing SOAP with COM Language Binding
SOAP transfers data from one point to another. It’s nothing more than a messenger when you think about it. The capabilities of SOAP are useless if the message is garbled or in a format that the recipient can’t understand. The process of binding allows the client to discover the message format required by the server to exchange information. SOAP provides the conduit required for this discovery phase.
Of course, there are a number of ways to provide binding services using SOAP. The SOAP specification doesn’t say much itself and leaves this detail up to the implementers. Microsoft and other companies choose to use WSDL as the discovery mechanism for SOAP. When you write the client portion of an application (as we did in Chapter 4), the application first queries the WSDL file to discover the message format required by the server. When you follow the flow of information from client to server, you’ll discover that the server sends the WSDL file to the client so that the client can use it for message formatting. (Make sure you read the sidebar, “Is WSDL Sufficient?” to discover some additional thoughts about this particular method of binding the component to the client application.)
Is WSDL Sufficient?
We looked at a simple SOAP example in Chapter 4 and used a Web Services Description Language (WSDL) file to define the component interface for that example. Just about every component-based application today relies on some type of Interface Definition Language (IDL) to describe the component’s construction. WSDL seems to work fine in the Chapter 4 example, but the component used provides such a simple interface that we probably didn’t need even WSDL to define it. However, what happens when you begin working with complex components or need to convert applications as we’re doing in this chapter? Does WSDL provide a sufficiently robust specification that it can handle these situations?
Some developers are asking these questions as they work with WSDL. The WSDL specification is relatively new. Vendors designed the WSDL specification to answer the problem of working with multiple proprietary IDL specifications such as Service Description Language (SDL), Network-Accessible Service Specification Language (NASSL), SOAP Contract Language (SCL), and SOAP Interface Definition Language (SIDL).
The short answer to the question of whether WSDL is sufficient is no. It won’t handle all the needs of a developer today and there’s a lot of room for interpretation in the specification. Because of these problems, some developers are proposing yet more specifications, such as A Little Interface Definition Language (ALIDL). You can read about this specification at http://www.xmlrpc.com/alidl. Of course, WSDL also has competition in the form of the established Universal Description, Discovery, and Integration (UDDI) specification.
The long view of technology is different, however. A developer today needs to wade through vast numbers of specifications because the technology is in its infancy. As SOAP and its associated technologies mature and developers have a larger knowledge base with which to make decisions, the current problems with WSDL should resolve themselves. In the end, developers need to choose a technology and use it until the technology matures or a truly better technology comes along. WSDL seems to provide the best IDL available for SOAP today; it’s the right choice for most development projects.
WSDL isn’t the only method for providing SOAP binding. Many developers prefer to leave nothing to chance and address the issue as a design decision. The SOAP message itself will always provide sufficient information for binding to take place. Of course, this requires an intimate knowledge of the application as a whole and provides a static solution to the problem. You’ll see in Appendix A, “SOAP Data Types and Data Type Conversions,” that this decision also affects the way data types are handled and can even change the performance characteristics of the application.
Some products, such as 4S4C (see Appendix D, “SOAP for Visual C++ Developers,” for details), directly query the component and provide binding in that manner. This is reminiscent of the method used by DCOM and will be the most familiar to Visual C++ developers who have had to worry about binding details in the past. Few vendors are embracing this methodology, but it’s an important method to consider if you have a large, well-established application.
No matter which method you use to bind the client to the server, SOAP will act as the messenger. The method used to perform the binding will affect the SOAP message format, but the result is the same: a client talking to a server and gaining access to resources. Vendors normally choose a binding method and don’t provide a means to modify that choice. As a result, you need to consider the choice of toolkit wisely to avoid problems with the current application setup.
Productivity Tips
If you’re in the same position that most DCOM developers are, the company president is breathing down your neck asking every hour about the progress you’ve made in converting the company applications for Internet use. It doesn’t matter that it took years to write those applications; the president wants them converted today in order to keep up with the competition. You need to increase productivity as quickly as possible.
This section of the chapter won’t include all of the productivity enhancements I’ve researched. You’ll find them spread throughout the book in the appropriate places. The purpose of this section is to look at some productivity aids for those people who are converting applications. Combine your favorite tips in the following sections with those in the rest of the book to come up with some techniques that work best for you.
Quick Research
Chapters 1, “An Overview of SOAP,” and 2, “SOAP in Theory,” provided you with some ideas on how a typical SOAP message is put together. In Chapter 4 we looked at how applications create SOAP messages to transfer data from one place to another. However, you might not have time to read all of that information every time you want to find out the meaning behind a particular SOAP element. You might need the information now. That’s where the SOAP 1.1 Reference Web site (http://www.zvon.org/xxl/soapReference/Output/index.html) comes into play. This Web site divides the SOAP message into easily recognizable pieces and allows you to read about that element only. Figure 5.3 shows what this Web site looks like.
Figure 5.3 The SOAP 1.1 Reference Web site is the best place to find information about SOAP quickly.
As you can see, all you need to do is choose a SOAP message element in the left pane to see information about it in the right. You can also click on keywords within the description, such as the encodingStyle keyword shown in Figure 5.3. This Web site only provides a quick look at SOAP elements, however. If you want to learn detailed SOAP information, you still have to rely on the SOAP specification or detailed write-ups such as those found in Chapters 1 and 2. Clicking the Go to standard link on this Web page takes you to the description of that particular element in the standard. It will still require time to read about the element, but at least you won’t spend time searching the specification for it. While the SOAP 1.1 Reference Web site won’t save you astronomical amounts of time, it does help in a world where every second counts. You can find the information you need, written in a very terse style, extremely fast. This is the place to go when you’re in a hurry and don’t care to know all of the details about a particular SOAP element.
Improving Your IDE
One of the best productivity enhancements you can perform is to improve the usability of your IDE. Many developers make their special tools available online, free for the price of a download. One of the better tools that I’ve found for Visual Basic is the MZ-Tools add-in (http://www.mztools.com/). This particular tool enhances productivity by adding a Main menu to the Visual Basic IDE, along with several context menus. We’ll explore this tool in detail in Appendix C, “Third-Party Tool Reference.”
Researching the WSDL Files
You’ll have to create WSDL files for all of your existing components and any new ones required for SOAP use. Studying these WSDL files can often point out flaws in an implementation that you might not otherwise see. Let’s look at a specific example.
Figure 5.4 shows the WSDL file for the AddIt application in Chapter 4. Notice that the application passes the two numbers used for the addition both to and from the server. While this small amount of data won’t cause a big performance drop when using DCOM, it could cause a significant problem when working with SOAP over an Internet connection.
Figure 5.4 The AddIt component WSDL file shows an implementation flaw.
All we really need for a return value is the result. The client already knows the two numbers used to create the result and there isn’t any reason for the server to modify them. Here are the simple changes you’d make to the application to fix this problem (the changes, bold in the printed listing, are the two ByVal keywords).
Public Function DoAdd(ByVal Add1 As Integer, _
                      ByVal Add2 As Integer) As Integer
    'Calculate the result.
    DoAdd = Add1 + Add2
End Function
No, it’s not a very big change, but it can make a difference in the operation of your SOAP application. Figure 5.5 shows the new WSDL file for this example. You may not see a very big difference with this example, but it can make a huge difference when you have several hundred users all hitting your database application at once. Even small performance gains can make the difference between an application that users like and one that gets complaints.
The bottom line is that researching the WSDL files can help you determine that the interfaces you think you have in place are the ones that the application will actually use. Verification of the current state of your DCOM application is an important part of the conversion process. Using this WSDL technique can save hours of looking at source code because it provides you with an encapsulated view of your application in an easy-to-understand format. Theoretically, you could even build an application to parse the WSDL files and verify them against your application design, saving even more time. (I imagine a third-party developer will eventually create an application just like the one I’m talking about here, so keep your eyes open.)
Figure 5.5 We fixed the flaw with a small coding change.
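The idea of checking WSDL files mechanically can be sketched in a few lines. The following Python example (Python rather than Visual Basic, so that it runs on its own) reports every operation whose input parts also appear in its output message, which is the flaw described for the AddIt component. The sample WSDL is a hypothetical, trimmed-down stand-in and not the file shown in Figure 5.4; the part names, the tns prefix, and the round_trip_parts function are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"

# Hypothetical WSDL fragment for a DoAdd operation whose arguments are
# passed by reference, so both numbers travel back in the response.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:tns="urn:example">
  <message name="DoAdd">
    <part name="Add1" type="xsd:short"/>
    <part name="Add2" type="xsd:short"/>
  </message>
  <message name="DoAddResponse">
    <part name="Result" type="xsd:short"/>
    <part name="Add1" type="xsd:short"/>
    <part name="Add2" type="xsd:short"/>
  </message>
  <portType name="AddItSoapPort">
    <operation name="DoAdd">
      <input message="tns:DoAdd"/>
      <output message="tns:DoAddResponse"/>
    </operation>
  </portType>
</definitions>"""


def local_name(qname):
    """Strip a namespace prefix such as 'tns:' from a message reference."""
    return qname.split(":")[-1]


def round_trip_parts(wsdl_xml):
    """Map each operation name to the parts it both sends and receives."""
    root = ET.fromstring(wsdl_xml)
    # Collect the part names of every message, keyed by message name.
    parts = {
        msg.get("name"): [p.get("name") for p in msg.findall(f"{WSDL}part")]
        for msg in root.findall(f"{WSDL}message")
    }
    report = {}
    for op in root.findall(f"{WSDL}portType/{WSDL}operation"):
        inp = op.find(f"{WSDL}input")
        out = op.find(f"{WSDL}output")
        if inp is None or out is None:
            continue  # a one-way operation cannot echo anything back
        sent = parts.get(local_name(inp.get("message", "")), [])
        returned = parts.get(local_name(out.get("message", "")), [])
        echoed = [name for name in sent if name in returned]
        if echoed:
            report[op.get("name")] = echoed
    return report


flagged = round_trip_parts(SAMPLE_WSDL)
```

An empty report for a component is a reasonable sign that no argument makes an unnecessary return trip. A real checker would also need to compare the operations against the application design, which this sketch doesn’t attempt.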
Updating a Simple Utility Program
Utility programs come in a number of shapes and sizes, and people invent new forms of utility programs every day. Microsoft recently introduced yet another version of the utility program, the Web Service. No matter which utility program you’re talking about, they all seem to have several common elements.
■ Small size
■ Not a main application
■ Easy to understand and use interface, or no interface at all
■ Only one or two modules with few interfaces
■ Normally used by savvy users
This section of the chapter will look at a simple utility that you might want to convert from DCOM to SOAP given today’s distributed environment. This utility simply polls the server for the current time and date. (We’ll extend this with the capability to poll service status in Chapter 6, “Creating Remote Access Utilities.”) Sure, you can get this information using other methods, but having an agent on every server that sends the information back to a small utility can be quite useful, especially if you have more than one type of server.
We talked about the need to use WSDL files with the Microsoft SOAP Toolkit in Chapter 4. The toolkit comes with two utilities you can use to create WSDL files. The problem with these utilities is that they generate UTF-16 files that may not work in every situation. You can find another fully functional WSDL Generator at http://www.phalanxsys.com/soap/wsdlwiz.htm. I’ll also discuss this utility in Appendix C. Another good WSDL toolkit candidate is the one from alphaWorks that you’ll find at http://www.alphaworks.ibm.com/tech/wsdltoolkit.
Updating the Server-Side Component
DCOM components start life as EXE files. That’s because Windows executes them as out-of-process components. An out-of-process server doesn’t rely on anything else to work. All you need to do is create a registry reference to it and then access it from within a client application. The local registry reference points to the server that contains the components, while the remote registry reference contains the normal out-of-process server entries.
The Que Web site contains all of the code used in this example. You can find it at www.quepublishing.com. You’ll find the DCOM client and component that I used as a starting point in the \Chapter 05\DCOM Utility Client and \Chapter 05\DCOM Utility Component folders. The converted client and server appear in the \Chapter 05\SOAP Utility Client, \Chapter 05\SOAP Utility Local Component, and \Chapter 05\SOAP Utility Component folders.
You can begin the conversion process for a Visual Basic project by opening the Project Properties dialog box shown in Figure 5.6. Change the project type to ActiveX DLL and check the Unattended Operation option. These two changes are all you need in many cases to convert the project from an out-of-process to an in-process server. Visual C++ users will want to create a new project and copy the class-specific code. In either case, create the DLL form of your component and move it to the server.
Figure 5.6 Modify the Project Properties dialog to match the new component requirements.
In some cases, the component won’t compile correctly, especially if it references other components. Visual C++ developers will experience this problem more often than Visual Basic developers. Remember that all references are from an in-process server now. This places limits on what you can do with the component from a memory usage perspective since the component won’t have its own memory to use.
You also have to check threading issues. The change from out-of-process to in-process means that you may be able to squeeze a little extra performance out of the component by changing the threading model to match the host. Theoretically, you won’t have to consider any additional thread-safety issues if the component is already thread safe in its EXE format. However, some DCOM developers take shortcuts with thread safety, so you may run into problems with this issue as well.
Of course, you’ll need to generate the requisite WSDL file for your server. I find that the SOAP Toolkit 2.0 Wizard dialog shown in Figure 5.7 provides a good overview of problematic methods. It highlights any methods with problem data types in red. If you see any problem methods, you may have to choose a different data type for your component method and upgrade the client as well. These problem methods will also generate question marks (“????????”) in the WSDL file. If you don’t see any red methods, then generate a WSDL file and check it for problem methods. I talked about this process in the “Researching the WSDL Files” section of the chapter.
Figure 5.7 The SOAP Toolkit 2.0 Wizard dialog will tell you about any problem methods in your component.
Creating the Local Component
The client application is still expecting a direct component reference. The first method for handling this problem is to change the client to use the WSDL file. You’d use code similar to the code found in Chapter 4 to go this route. The second method is to use a local component that accesses the remote component using SOAP.
You can use two different techniques to create the local component. The first method is to use the SOAP Messaging Object (SMO) technique. I introduced the Microsoft SOAP Messaging Object Generator in the “Microsoft SOAP Messaging Object Generator” section of Chapter 4. Don’t get the idea that this method is a free ride since the wizard creates a lot of the code for you. It does greatly reduce the amount of work, but you’ll still need to perform a lot of coding. SMO is more suitable for database application development, so we’ll look at it later in the book.
The second method is to create a local component that stands in for the remote component that you used in the past. The local component will access the remote component using SOAP. That’s the technique most suited to an existing DCOM application. If you give the local component the same name and use the same interfaces as you did before, the client application might not require much in the way of update. Best of all, except for a local registration that you could perform using a script when the user logs into the server, you don’t have any messy updates to perform.
Begin by creating an ActiveX DLL or ActiveX EXE application. You may be able to gain a speed advantage using an ActiveX DLL, but the application update process is simpler using an ActiveX EXE. I normally rely on an ActiveX EXE to make things simple. Here’s the code you’ll add to the project.
Public Function GetTime() As String
    'Create the SOAP client.
    Dim Client As SoapClient
    'Set up an error handler.
    On Error GoTo ErrorHandler
    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://winserver/soapexamples/DCOMUpdate1/CheckStat1.WSDL", _
        "CheckStat1", _
        "CheckServerSoapPort"
    'Get the status information.
    GetTime = Client.GetTime
    Exit Function
'Display a message when an error occurs.
ErrorHandler:
    'Create the error message.
    Dim ErrorMessage As String
    ErrorMessage = "Fault Code: " + Client.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + Client.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + Client.faultactor
    'Pass the error back to the client.
    Err.Raise 0, "SOAP Processing Error", ErrorMessage
End Function
Notice that the code is similar to what we used for the client application in Chapter 4. You still have to create a SoapClient object and use it to provide the remote access. Obviously, you can use any other technique supported by the Microsoft SOAP Toolkit as well, but this method works well for DCOM upgrade projects where you don’t have a lot of data concerns. Of course, instead of displaying the results within the component, you’ll send them along to the client. Also, notice that this example provides more error handling information to the client, but that the error number is still set to 0 because we can’t determine it from the SOAP message.
The example uses the same project and class name as the DCOM project that I updated. In fact, if you look in the DCOM Configuration Utility (Figure 5.8), you’ll see two copies of the component now. This brings up an important point. Make sure your registration script for the new local component also contains code to unregister the old remote component. Otherwise, the user will end up with ambiguous registry entries and you might end up with additional work in the form of support calls.
Figure 5.8 The DCOM Configuration Utility will show two versions of the same component.
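The stand-in pattern used by the local component can be shown outside Visual Basic as well. The following is a minimal Python sketch: the client keeps calling GetTime on a local object, and the object forwards the call as a SOAP message and turns a fault into a local error. CheckServerProxy, SoapProcessingError, and fake_transport are hypothetical names, and the request and response messages are simplified assumptions (unqualified GetTime, GetTimeResponse, and Result elements), not what the Microsoft SOAP Toolkit actually puts on the wire.

```python
import xml.etree.ElementTree as ET

ENV = "{http://schemas.xmlsoap.org/soap/envelope/}"
ENV_URI = "http://schemas.xmlsoap.org/soap/envelope/"


class SoapProcessingError(Exception):
    """Raised by the stand-in when the remote side answers with a fault."""


class CheckServerProxy:
    """Local stand-in exposing the same GetTime method the client already calls."""

    def __init__(self, transport):
        # transport: any callable that takes request XML and returns response XML.
        self._transport = transport

    def GetTime(self):
        request = (
            f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{ENV_URI}">'
            "<SOAP-ENV:Body><GetTime/></SOAP-ENV:Body></SOAP-ENV:Envelope>"
        )
        body = ET.fromstring(self._transport(request)).find(f"{ENV}Body")
        if body is None:
            raise SoapProcessingError("response has no Body element")
        fault = body.find(f"{ENV}Fault")
        if fault is not None:
            # Pass the fault text back to the caller as a local error.
            raise SoapProcessingError(fault.findtext("faultstring", default=""))
        return body.findtext("GetTimeResponse/Result", default="")


def fake_transport(request_xml):
    """Stands in for the HTTP round trip to the SOAP listener."""
    return (
        f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{ENV_URI}">'
        "<SOAP-ENV:Body><GetTimeResponse><Result>12:00:00</Result>"
        "</GetTimeResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>"
    )


server_time = CheckServerProxy(fake_transport).GetTime()
```

Passing the transport in as an argument is a design choice made for this sketch: it lets you test the stand-in locally before a network or a listener is involved, which fits the testing advice given earlier in the chapter.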
Updating the Standard Client
Given the method that we’ve used to create the local component, updating the client is easy. All you need to do is change the reference within the Visual Basic application. I gave the local component a distinctive name so that it would be hard to confuse it with the old remote component. Figure 5.9 shows the one and only change to the updated DCOM client. At this point, all you need to do is compile the client and distribute the new copy to users.
Theoretically, you could carefully code the local component so that you wouldn’t need to change the client at all. What this means is ensuring the local component has the same GUID as the old remote component did. You’ll also need to make the appropriate registry changes so the client uses the local component, rather than looks for the remote component.
Figure 5.9 Using a local component makes updating the DCOM client easy.
However, this practice is for advanced programmers only—those with nerves of steel who are willing to take the chance the application won’t work at all. Most developers will take the additional step required to update the reference and recompile the application.
Updating a Data Viewer
Many people interact with database applications on the Internet, but only to gather information. A potential customer might want to look at the stock you have to sell or a partner might want to learn the status of a project. Some people will just want to browse any free information your company can provide. No matter what the reason, there are lots more data viewer applications than full-fledged database management applications out there today. While you may eventually upgrade the database management application used by employees, it’s almost certain that your first step will be to create a data viewer.
Figure 5.10 shows the design diagram for a SQL Server database that I created. It’s a simple multitable example and we’ll explore it in more detail later. All you need to know now is that this database contains a Client table that some of the sales staff want to look at while on the road. I have a DCOM application that views the data in this table and we’ll convert it to a SOAP application in this section. As with every other example in this chapter, you’ll find the source code on the Que Web site for this book. You can find it at www.quepublishing.com.
Before we actually perform the conversion, however, you might want to know a little more about the three-tier programming concept. Many developers still use the client/server programming model, but that won’t work well with SOAP. You need the additional features that a three-tier application can provide. The remaining sections help you understand the process for moving a data viewing application from the LAN to the distributed environment of the Internet.
Figure 5.10 A simple multitable database provides the basis for the data viewer application.
Understanding the Importance of Three-Tier Programming You could actually replace the three-tier in the title for this section with “multi-tier” or “n-tier,” but most database documentation refers to it as the three-tier model. The three tiers correspond to the client, business logic, and back-end processing portions of such an application. The three-tier method of creating database applications is especially important for SOAP developers because that middle tier provides the buffering needed for secure data transmissions. You don’t want to expose your SQL Server setup to the world at large. There’s a lot of misunderstanding about the role of three-tier development today and its affect on the developer. In the past, a client would send data requests to a server. The server would answer back. This is the client/server model. It’s not limited to two machines—there can be as many servers as required to get the job done. The limitation is in the number of levels of communication. When using the client/server model, the client requests data directly from the server, which infers that the client knows the identity of the server. In addition, the server can’t request data from another server on the client’s behalf—the client and the server always have a direct relationship. Unfortunately, client/server won’t work very well in the distributed computing environment. As server farms become more prevalent on large networks, it’s not always possible for the application to know which server will fulfill a request. Load balancing and other concerns make it possible that one server will service a client during one request and another server when making a second request. This is especially true with SOAP where each client request is another session. In fact, it’s possible that a server will go offline and hand the request to
another server mid-way through the transaction. Given the realities of SOAP and the distributed programming environment, direct contact is no longer feasible. Another problem with client/server is that the client has to directly request every piece of data it needs, which means the client has to have a lot of intelligence. Adding intelligence to the client, rather than the server, means writing a lot of redundant code. In a distributed environment, the client should only need to worry about making an initial request for data. It’s up to the server to figure out how to get that data for the client. If the server has to call on several other servers to fulfill the request, the client shouldn’t be aware of the background transactions taking place. Building an application using this technique means that the server can handle a wider range of clients, all of which use a similar request format. Consider, for a moment, that the developers of SOAP wanted to provide an easy to use protocol for distributed component use and discovery by a wide range of platforms. In this new environment, anything from a PDA to a Linux server could access your system in search of data. You can’t assume anything about the client, which means the intelligence for an application must reside on the server. While the client is responsible for the presentation of the data (unlike in the days of dumb terminals), the server is responsible for gathering the data in an easily recognizable (documented) format. It’s important to understand that three-tier applications can work across many kinds of boundaries because the request format is always the same. A thin client using a browser as a container will request data using the same request format as a thick client operating on the desktop. The server no longer needs to know what kind of client is making the request, only that the client needs data and resources that the server can provide. 
In short, three-tier computing and the distributed environment both require self-contained servers and clients, each of which only knows about the data and objects that are being exchanged. As you can see, three-tier computing isn’t about specific roles—it’s about modularity and flexibility across vast distances. Unlike client/server, which vendors designed for use in a LAN environment, three-tier applications are designed to work in the WAN or MAN environment, where you can’t assume anything about the connection between the client and server.
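The following Python sketch (not part of the book’s Visual Basic source) illustrates the point about a uniform request format. The class names `BackEnd` and `MiddleTier`, the round-robin choice of server, and the sample data are all assumptions made for illustration. The client only ever talks to the middle tier and never learns which back-end server answered.

```python
from itertools import cycle


class BackEnd:
    """Hypothetical back-end data server; clients never contact it directly."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows

    def fetch(self, customer_id):
        return self._rows.get(customer_id)


class MiddleTier:
    """Business-logic tier: the only address a client needs to know."""

    def __init__(self, backends):
        self._pool = cycle(backends)  # naive round-robin load balancing

    def handle(self, request):
        # Every client, thin or thick, sends the same request format.
        backend = next(self._pool)
        row = backend.fetch(request["customer_id"])
        if row is None:
            return {"status": "not found"}
        # Return plain strings so that any client can present the data.
        return {"status": "ok", "data": {k: str(v) for k, v in row.items()}}


rows = {"1001": {"FirstName": "Ann", "ZIP": 46290}}
tier = MiddleTier([BackEnd("sql-a", rows), BackEnd("sql-b", rows)])
first = tier.handle({"customer_id": "1001"})   # served by one server
second = tier.handle({"customer_id": "1001"})  # served by the other
```

Both calls return the same answer even though different servers handled them, which is the property the text asks of a distributed application.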
Updating the Common Business Logic Component

The DCOM application uses a simple server-side component in this case to serve up data in the form of records. Here’s a sample of the code from the component.

Public Sub GetClient(ClientData As ADODB.Recordset)
    'Open the recordset.
    deClient.rsClient.Open

    'Make the data accessible to the client.
    Set ClientData = deClient.rsClient

    'Close the recordset.
    deClient.rsClient.Close
End Sub
You should notice an immediate problem with this code. SOAP can’t handle the recordset data type. Another problem is that this component assumes the client will process the data. Of course, that’s another constraint when working with SOAP: you can’t assume anything about the client. A PDA may not have the memory required to handle an entire recordset, much less manipulate it.

The component logic is simple in this case, so there isn’t much you can do to simplify it. However, the client side of the equation is quite complex, so that’s where you would look for areas of simplification. What we really need to do is leave the DCOM application alone and create a new component that performs some of the work that we expect the DCOM client to do. In short, we need to move some of the viewing logic from the client to a new server-side component. Given the format of the data shown in Figure 5.10, we can probably pass all of the information to the client in the form of strings after processing.
Separating the Data Viewing Logic from the Main Database Component

The SOAP component is actually a new server-side DLL in this example. Depending on how you set your database application up, however, the server-side DLL could contain a lot more than just viewing logic. We’ll need methods that return the next and previous records. The client will keep track of its current position in the database by returning the Customer ID value along with the next method invocation. We’ll talk in the “Updating a Complete Database Application” section about the need to maintain state information for database applications; this is one such technique.

Now that we have some of the preliminaries out of the way, let’s talk about some code. Listing 5.1 shows the code that I created for the SOAP server-side DLL for this example. Remember that most of this code used to appear in the client application; we just moved it to the server.
Listing 5.1 A Server-Side Component for SOAP Use

'Create the required components
Dim oOrderModify As ViewData.ViewClient
Dim rsCatalog As ADODB.Recordset

'Create the data exchange variables.
Dim CustomerID As String
Dim FirstName As String
Dim MiddleInitial As String
Dim LastName As String
Dim Title As String
Dim Company As String
Dim Address1 As String
Dim Address2 As String
Dim City As String
Dim State As String
Dim ZIP As String
Dim Telephone1 As String
Dim Telephone2 As String

Option Explicit

Private Sub PerformUpdate()
    'Make sure we can display the record.
    If Not (rsCatalog.EOF Or rsCatalog.BOF) Then

        'Gather the information.
        CustomerID = rsCatalog.Fields(0).Value
        FirstName = rsCatalog.Fields(1).Value
        If Not IsNull(rsCatalog.Fields(2).Value) Then
            MiddleInitial = rsCatalog.Fields(2).Value
        Else
            MiddleInitial = " "
        End If
        LastName = rsCatalog.Fields(3).Value
        If IsNull(rsCatalog.Fields(4).Value) Then
            Title = "N/A"
        Else
            Title = rsCatalog.Fields(4).Value
        End If
        If IsNull(rsCatalog.Fields(5).Value) Then
            Company = "N/A"
        Else
            Company = rsCatalog.Fields(5).Value
        End If
        Address1 = rsCatalog.Fields(6).Value
        If IsNull(rsCatalog.Fields(7).Value) Then
            Address2 = "N/A"
        Else
            Address2 = rsCatalog.Fields(7).Value
        End If
        City = rsCatalog.Fields(8).Value
        State = rsCatalog.Fields(9).Value
        ZIP = rsCatalog.Fields(10).Value
        If Not IsNull(rsCatalog.Fields(12).Value) Then
            Telephone1 = rsCatalog.Fields(12).Value
        End If
        If Not IsNull(rsCatalog.Fields(13).Value) Then
            Telephone2 = rsCatalog.Fields(13).Value
        End If
    End If
End Sub

Sub GetNext(sCustomerID As String, _
            sFirstName As String, _
            sMiddleInitial As String, _
            sLastName As String, _
            sTitle As String, _
            sCompany As String, _
            sAddress1 As String, _
            sAddress2 As String, _
            sCity As String, _
            sState As String, _
            sZIP As String, _
            sTelephone1 As String, _
            sTelephone2 As String)

    'Open the recordset.
    LoadData

    'Search for the current record.
    Dim Search As String
    Search = "CustomerID=" + sCustomerID
    rsCatalog.Find Search

    'Make sure there is a next record.
    If Not rsCatalog.EOF Then

        'Go to the next record.
        rsCatalog.MoveNext
        PerformUpdate

        'Get the data.
        sCustomerID = CustomerID
        sFirstName = FirstName
        sMiddleInitial = MiddleInitial
        sLastName = LastName
        sTitle = Title
        sCompany = Company
        sAddress1 = Address1
        sAddress2 = Address2
        sCity = City
        sState = State
        sZIP = ZIP
        sTelephone1 = Telephone1
        sTelephone2 = Telephone2
    End If

    'Close the recordset.
    UnloadData
End Sub

Private Sub LoadData()
    'Instantiate the DCOM component.
    Set oOrderModify = New ViewData.ViewClient

    'Gain access to the recordset.
    oOrderModify.GetClient rsCatalog

    'Open the recordset.
    rsCatalog.Open
End Sub

Private Sub UnloadData()
    'Close the recordset.
    rsCatalog.Close

    'Clean up the objects.
    Set oOrderModify = Nothing
    Set rsCatalog = Nothing
End Sub
This code is somewhat truncated from the original on the Que Web site for this book. You can find the full version at www.quepublishing.com. In addition to the GetNext() method, there’s also a GetFirst() and a GetPrevious() method. All three methods work in essentially the same manner. The one difference is that the GetFirst() method doesn’t need to search for a record.

Note the technique used to create the PerformUpdate() method. Every field that isn’t part of the primary key or required in some other way has an alternate value. If you don’t include code like this, then SOAP will send an error message to the client that it received data that wasn’t marked “text/xml.” In fact, SOAP will often resort to this message when working with database applications, leaving you without any clue as to the source of the problem. Robust error handling isn’t optional when working with SOAP because it presents messages that are even more ambiguous than the normal variety.

The overall process for this code is that the method creates an instance of the original component. This, in turn, opens the database for use. The method requests the data for the current record, formats it as strings, and passes the information to the client. The method call ends by closing the database and releasing locally created objects.
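To make the open, fetch, format, and close cycle concrete, here is a minimal Python sketch of the same idea. It is not the book’s component: it assumes an in-memory SQLite table with a handful of invented columns and an integer Customer ID, and it finds the next record with a `CustomerID > ?` query rather than the Find and MoveNext calls the listing uses. The `DEFAULTS` table plays the role of the alternate values in PerformUpdate().

```python
import sqlite3


def open_db():
    """Stand-in for LoadData(): an in-memory table with assumed columns."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE Client (CustomerID INTEGER PRIMARY KEY, "
        "FirstName TEXT, MiddleInitial TEXT, LastName TEXT, Title TEXT)"
    )
    db.executemany(
        "INSERT INTO Client VALUES (?, ?, ?, ?, ?)",
        [(1, "Ann", None, "Lee", None), (2, "Bob", "Q", "Ray", "Buyer")],
    )
    return db


# Alternate values for fields that may be NULL, as in PerformUpdate().
DEFAULTS = {"MiddleInitial": " ", "Title": "N/A"}


def get_next(customer_id):
    """Return the record after customer_id as strings, or None at the end."""
    db = open_db()  # each call is self-contained
    try:
        cur = db.execute(
            "SELECT * FROM Client WHERE CustomerID > ? "
            "ORDER BY CustomerID LIMIT 1",
            (customer_id,),
        )
        row = cur.fetchone()
        if row is None:
            return None
        names = [col[0] for col in cur.description]
        # Replace NULLs with an alternate value and stringify everything else.
        return {
            n: DEFAULTS.get(n, "") if v is None else str(v)
            for n, v in zip(names, row)
        }
    finally:
        db.close()  # stand-in for UnloadData()
```

The caller passes back the last Customer ID it saw, so the server needs no memory of earlier calls.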
Creating the Data Viewer Client

The use of SOAP and an intermediate component simplifies the client. All the client needs to do is create a connection to the server-side SOAP component, get the data in the form of strings, and display it on screen. Listing 5.2 shows what the client code looks like for this example (the code is again cut for the sake of brevity; check the Que Web site at www.quepublishing.com for a full copy of the source code).
Listing 5.2 Data-Viewer Client Code

'Create the SOAP client.
Dim Client As SoapClient

'Create the data storage variables.
Dim CustomerID As String
Dim FirstName As String
Dim MiddleInitial As String
Dim LastName As String
Dim Title As String
Dim Company As String
Dim Address1 As String
Dim Address2 As String
Dim City As String
Dim State As String
Dim ZIP As String
Dim Telephone1 As String
Dim Telephone2 As String

Private Sub cmdNext_Click()
    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Get the customer data and display it.
    Client.GetNext CustomerID, _
                   FirstName, _
                   MiddleInitial, _
                   LastName, _
                   Title, _
                   Company, _
                   Address1, _
                   Address2, _
                   City, _
                   State, _
                   ZIP, _
                   Telephone1, _
                   Telephone2

    'Display the result.
    DisplayResult
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    MsgBox Client.faultstring, vbExclamation
End Sub

Private Sub cmdQuit_Click()
    'Exit the Application
    End
End Sub

Private Sub Form_Load()
    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://winserver/soapexamples/DCOMUpdate2/DataFormat.WSDL", _
        "DataFormat", _
        "ClientViewSoapPort"
End Sub

Private Sub DisplayResult()
    'Display the result
    txtCustomerID.Text = CustomerID
    txtFirstName.Text = FirstName
    txtMiddleInitial.Text = MiddleInitial
    txtLastName.Text = LastName
    txtTitle = Title
    txtCompany = Company
    txtAddress1 = Address1
    txtAddress2 = Address2
    txtCity = City
    txtState = State
    txtZIP = ZIP
    txtTelephone1 = Telephone1
    txtTelephone2 = Telephone2
End Sub
As you can see, this code is just an extension of the client code we viewed earlier. The main difference is that I’ve placed some of the code in centralized procedures to accommodate the needs of all of the methods. The cmdFirst_Click() and cmdPrevious_Click() methods work precisely the same as the cmdNext_Click() method shown in the listing. The code required to create a connection is executed automatically when the form loads, which can save time.
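Behind a call such as Client.GetNext, the client and server exchange XML messages. The Python sketch below builds a generic SOAP 1.1 request by hand and reads it back, just to show the shape of such a message. It is not the exact message the Microsoft SOAP Toolkit generates: the method namespace `urn:example:DataFormat` is invented for illustration, and only one of the string parameters is included.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
METHOD_NS = "urn:example:DataFormat"  # hypothetical method namespace


def build_get_next(customer_id):
    """Build a SOAP 1.1 request for a GetNext call carrying one string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{METHOD_NS}}}GetNext")
    ET.SubElement(call, "sCustomerID").text = customer_id
    return ET.tostring(envelope, encoding="unicode")


def read_customer_id(xml_text):
    """Parse the message back and pull out the sCustomerID value."""
    root = ET.fromstring(xml_text)
    return root.find(f".//{{{METHOD_NS}}}GetNext/sCustomerID").text


message = build_get_next("1001")
```

Everything travels as text inside the envelope, which is why the server-side component converts each field to a string before returning it.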
Updating a Complete Database Application

In the previous two sections, you’ve learned how to convert existing DCOM applications. We’ve used both an ASP and an ISAPI Listener approach in these two sections. These two examples provide you with a good understanding of the DCOM part of the picture: what you need to do in order to get an existing application running as quickly as possible under SOAP.

Chapter 8, “Providing Remote Database Access,” is going to show you how to create a full-fledged database management example using a third-party product. In some respects, the Microsoft SOAP Toolkit is still very much a diamond in the rough. After a lot of consideration, I decided that another toolkit is better suited to working with SOAP. While this is true today, you may find that Microsoft has upgraded their product by the time you read this, which means you may not need a third-party product to create fully functional database applications.
The Microsoft SOAP Toolkit doesn’t provide any means to validate the schema input from a client against the schema within the WSDL file. This means that a client could pass an invalid schema and you wouldn’t know that it was invalid until the server failed. Unfortunately, a failure at this point masks the true source of the problem unless you include special code within the component that manually validates the schema. Microsoft plans to add validation to a future version of the toolkit. In the meantime, you’ll need to check for schema related problems using additional code.
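The “additional code” mentioned in the note could take many forms. The following Python sketch shows one simple possibility: checking incoming parameter names and types against a table of expectations before any real work starts. The `EXPECTED` table and the function name are assumptions made for this example; a real component would derive its expectations from its own WSDL file.

```python
# Assumed parameter names and types for illustration only.
EXPECTED = {"sCustomerID": str, "sFirstName": str, "sLastName": str}


def validate_request(params):
    """Return a list of problems; an empty list means the input is usable."""
    problems = []
    for name, kind in EXPECTED.items():
        if name not in params:
            problems.append(f"missing parameter: {name}")
        elif not isinstance(params[name], kind):
            problems.append(f"wrong type for {name}")
    for name in params:
        if name not in EXPECTED:
            problems.append(f"unexpected parameter: {name}")
    return problems
```

Reporting every problem at once, rather than failing on the first one, gives the client a message that points at the true source of the error.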
This section of the chapter is going to look at some principles you need to understand in order to convert a DCOM database application into something that will work with SOAP. It’s important to consider how this kind of application differs from the other two applications we have converted. Of course, certain principles remain the same, no matter what type of application you create. For example, you’ll still need to verify that SOAP can handle the data types in your application. It’s also important to check the WSDL file for problematic conversions and inefficiencies.

A full-fledged database application sends and receives data. This two-way communication is problematic when using SOAP because SOAP is a stateless protocol. Every time you call the server, you’re starting from scratch. The server won’t remember anything you set up earlier. This means that you’ll have to make every query self-contained. The query will need to open the connection, make any required requests, obtain a result, and close the connection. Of course, this makes using SOAP a lot less efficient than working with DCOM. This is the strict interpretation of SOAP and you should follow it if you’re working with more than one platform.

Several vendors have concluded that some classes of SOAP applications will need to maintain state in some way. Database applications fall into the stateful category. It’s inefficient to treat each query as a new conversation with the server. However, since SOAP doesn’t know how to maintain state, you have to be resourceful in creating a SOAP database solution. It’s possible to maintain state outside of SOAP. You’ll have to jump through some hoops to do it, but you can do it. Here are some tips for creating a stateful communication with SOAP.

■ Store the state information separately from the SOAP session. Some developers suggest using a text file, others the registry, and still others Active Directory. The point is that you need some external means of storing and retrieving the data.

■ Ensure you track the user’s information. The only way you can restore state information is if you can associate the data with a particular session. The best way to do this is to restrict the user to a single session, and then track the user identification information. Since you’ll need to secure any database transactions over the Internet anyway, you’ll always have user identification information at your disposal.

■ Provide some type of “watchdog timer” support for lost connections. DCOM and other protocols periodically check the connection with the user by pinging the remote connection. You can simulate this support by maintaining a local timer. Every time the user makes a transaction, the application resets the timer. If the timer runs out, then the application rolls back any pending transactions and the state information is lost.

■ Include some type of transaction support. SOAP may actually provide this support as an added feature by the time you read this. Developers currently use SOAP Actor support to maintain transactions. A central server keeps track of the various transactions.

■ Maintain state-related code in a separate component. In many cases, you can use this code for more than one component. The cost in coding hours of creating state maintenance components is high, so you’ll want to get as much use out of the code as possible. Make sure you use a modular approach that will work in a variety of situations.
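The first three tips above can be combined in a small, separate state component. The Python sketch below is one way to do it, under stated assumptions: the class name `SessionStore`, the in-memory dictionary (standing in for a file, the registry, or a directory service), and the injectable clock are all choices made for this example rather than anything the SOAP toolkits provide.

```python
import time


class SessionStore:
    """Hypothetical external state store keyed by user ID."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the timer can be tested
        self._sessions = {}  # user_id -> (state, time of last transaction)

    def save(self, user_id, state):
        # Every transaction resets the user's watchdog timer.
        self._sessions[user_id] = (state, self._clock())

    def load(self, user_id):
        entry = self._sessions.get(user_id)
        if entry is None:
            return None
        state, last_seen = entry
        if self._clock() - last_seen > self._timeout:
            # Timer ran out: discard the state. A real component would
            # also roll back any pending transactions at this point.
            del self._sessions[user_id]
            return None
        return state


now = [0.0]  # a fake clock so that the example runs instantly
store = SessionStore(timeout_seconds=30, clock=lambda: now[0])
store.save("jsmith", {"CustomerID": "1001"})
now[0] = 10.0
fresh = store.load("jsmith")  # within the timeout, state is restored
now[0] = 100.0
stale = store.load("jsmith")  # past the timeout, state is gone
```

Because the store is keyed by user identification, each stateless SOAP call can restore the caller’s position without the protocol itself remembering anything.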
Once you have code that will maintain the user’s state, as well as provide transactional support, you still need to consider the unreliable nature of the Internet. Working with the Internet means accepting the possibility that some transactions won’t complete. Not only do you need to maintain a strict multi-tier approach to coding your application, but you also need to buffer the data in some way. It’s important to maintain the integrity of the main database. Some developers rely on a buffering database that the server can scan on a regular basis for errors. If the database integrity is maintained, the server commits the buffered records in batch mode, incorporating them into the main database for use by others. Of course, you’ll need to provide some level of training for users to ensure they understand this approach to maintaining data integrity. The records an employee creates on the road today won’t show up in the main database until after the next update period.

Some developers are reading this section and shaking their heads. It’s true that many developers view SOAP as the worst possible protocol for performing any sort of database work, especially record updates and additions. Pure SOAP probably isn’t up to the job. You really do need something that is completely reliable for mission critical applications. In some respects, this brings us back to the DCOM application you’re using today. As mentioned in the very beginning of this chapter, you always need to consider that the protocol you’re using today might already be the best one for the job. SOAP does add many new capabilities to the programmer’s toolbox, but it isn’t the be-all and end-all of programming technologies.
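The buffering approach described above can be sketched in a few lines of Python with SQLite. The table names `Client` and `ClientBuffer`, the two-column layout, and the “no empty fields” integrity check are assumptions chosen to keep the example short; a production scan would apply whatever rules the main database requires.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Client (CustomerID TEXT PRIMARY KEY, LastName TEXT NOT NULL)")
db.execute("CREATE TABLE ClientBuffer (CustomerID TEXT, LastName TEXT)")


def submit(customer_id, last_name):
    """Remote updates land in the buffer, never directly in the main table."""
    with db:
        db.execute("INSERT INTO ClientBuffer VALUES (?, ?)", (customer_id, last_name))


def commit_batch():
    """Scan the buffer, commit valid rows in one transaction, return rejects."""
    rows = db.execute("SELECT CustomerID, LastName FROM ClientBuffer").fetchall()
    good = [r for r in rows if r[0] and r[1]]  # simple integrity check
    bad = [r for r in rows if not (r[0] and r[1])]
    with db:  # all-or-nothing: commits on success, rolls back on error
        db.executemany("INSERT OR REPLACE INTO Client VALUES (?, ?)", good)
        db.execute("DELETE FROM ClientBuffer")
    return bad


submit("1001", "Lee")
submit("1002", None)  # incomplete record sent from the road
rejected = commit_batch()
```

Only the complete record reaches the main table, and it becomes visible to other users at the batch commit rather than at the moment it was entered.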
Modified Application Concerns

I wish that I could say that every application you upgrade is going to work perfectly right out of the testing lab, but you’d know that I was lying. Even applications that you create from scratch and thoroughly test using a large beta group are likely to have problems. If you ever doubt this statement, just ask Microsoft about the problems that they’ve experienced over the years. If anyone should be able to create the perfect application, they certainly have the resources and testing base to do so.

Converted applications are even more likely to have problems than ones you create from scratch. Not only do you have new code to consider, but you also have to consider the assumptions made for the old application code. In addition, updating an application implies that you’ll spend time looking at code someone else wrote. Depending on the skill of the other programmer, you may find yourself fixing their bugs as well as your own. So, it’s probably a good idea to plan on spending time looking at problems with your converted application when you first release it.

The following sections are going to look at the three major areas of modified application failure that you’ll need to consider in addition to what we’ve talked about already: reliability, security, and performance. If you’ve followed the advice in the previous sections, you’ll already have considered problems with the protocol, code modifications in existing modules, and code problems in new modules. You’ll have also considered some usage problems, such as users who are well acquainted with one method of working with the application and refuse to learn anything else.
Reliability

Application reliability is one of the major areas of concern for a modified application. DCOM is a two-way communication where the client and server have constant contact. The client receives verification that certain events have taken place when working with DCOM, so you can validate the exchange of data. In addition, you can easily add transaction support to a DCOM application so that the client always has absolute verification that a certain set of actions has taken place.

SOAP offers no such guarantees right now because it provides only one-way communication and there is no guarantee that the server will respond to client requests. In addition, there’s currently no way to create a transaction using SOAP, which means you no longer have the option of checking the effect of a SOAP message exchange on the content of the database. Microsoft and other vendors are currently working on remedies for this situation, but in the meantime, you’re on your own. You could add code that would simulate the effects of using transactions by using four message transfer stages for each transaction, but consider the significant hit that your application would take in performance. The Internet isn’t the fastest method to transfer data to begin with; adding an additional set of request/response messages will only make matters worse.

One of the problems that I’ve run into quite often is that an application’s developer makes certain assumptions about the presence or lack of error information. An application may assume that it can verify a transaction between the client and server. It raises an error when it can’t perform the validation, even if the transaction took place. The error handler might further rely on application features that aren’t available when using SOAP, making it likely that it, too, will fail. In the end, an application can fail, not because of any actual error, but because of errors in the developer’s assumptions.
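The text doesn’t spell out what the four message transfer stages would be. One possible reading is a prepare request and its acknowledgment followed by a commit request and its acknowledgment. The Python sketch below shows that reading only; the `Server` class, its method names, and the status strings are all invented for illustration, and ordinary method calls stand in for the SOAP messages.

```python
class Server:
    """Hypothetical server that holds a record until the client confirms it."""

    def __init__(self):
        self.pending = {}
        self.committed = {}

    def prepare(self, txn_id, record):
        # Stages 1 and 2: the client sends the data, the server acknowledges.
        self.pending[txn_id] = record
        return "prepared"

    def commit(self, txn_id):
        # Stages 3 and 4: the client confirms, the server reports the result.
        if txn_id not in self.pending:
            return "unknown"
        self.committed[txn_id] = self.pending.pop(txn_id)
        return "committed"


def send_with_confirmation(server, txn_id, record):
    """Return True only when both request/response pairs succeed."""
    if server.prepare(txn_id, record) != "prepared":
        return False
    return server.commit(txn_id) == "committed"


server = Server()
ok = send_with_confirmation(server, "t1", {"LastName": "Lee"})
```

Each update costs two round trips instead of one, which is the performance hit the text warns about.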
We discussed the solution for error handler problems in the “Error Handling Code” section of the chapter.

Some reliability problems that you’ll face are the result of using the Internet as a data transfer medium. The Internet is inherently unreliable; trusting the Internet to provide reliable data transport is akin to trusting the postal system to deliver a letter. Generally, the postal system gets a letter to its intended destination, but you have no guarantee this will occur and therefore need to accept some loss of reliability as part of the price of using the Internet. In short, you need the transaction processing provided by DCOM, which is unavailable today. Until the Internet becomes more reliable, you’ll need to add extra code to your applications that detects errors that could occur because of a dropped line.

User unfamiliarity with Internet applications can also prove problematic. User training resolves part of the problem by showing users the proper techniques for data entry and other tasks. However, SOAP applications can create certain classes of problems that you can’t avoid. Laptop keyboards are notoriously small and even a skilled typist will make more entry errors. The character recognition features of PDAs can also prove difficult and you need to check for new classes of data entry error that you may not have had to consider before. In addition, users can get frustrated if they end up sending too many partial entries to the server that are rejected later. A PDA screen is small in the extreme, making it likely
that a user will click on what they think is an application feature such as help, when the button really sends the data to the server. Until vendors devise a reliable transaction processing methodology, you should consider limiting distributed application data sent by SOAP. Don’t bet the company on a protocol that isn’t as reliable as the data transfer protocols you used in the past. Many distributed applications will work better in mixed mode simply because older protocols like DCOM are more reliable.
Security

Modified applications present several security problems that you might not have had to worry about with the original application. The most pressing problem is securing the data transfer itself. This isn’t a problem when using DCOM because you can secure the application using standard DCOM settings. DCOM offers a wide range of security features. Remember from previous discussions, however, that SOAP doesn’t offer any security at all. You need to add security as a separate feature, which means you can’t integrate the security fully with the application. Any time you have a security solution that works separately from the application, you open the possibility for security holes. You can’t guarantee a safe data transfer because SOAP itself makes no such guarantees.

Creating dual mode applications also means checking two separate paths for security problems. Crackers love overly complex systems precisely because they are difficult to secure. The dual mode nature of your modified application does reduce risk by isolating the external modules from those used locally. However, the complexity that this configuration adds also increases the potential for a failure due to security concerns.

Working with distributed applications means opening your network to the outside world. After all, even if your network is secure, the user on the road is still a point of access that a potential cracker could exploit. Consider the number of recent break-ins where an employee accessing a corporate network from home opened the door for a cracker invasion. In fact, the security of remote machines has become a major issue. Many companies are looking for ways to increase their network security by securing these home machines.

The modified application is also a slave to the security provided by the Web server. SOAP accesses your network through an Internet connection that the Web server handles. If the Web server is less than secure, then so is your application.
This problem is severe enough that the Federal Bureau of Investigation (FBI) recently started issuing warnings to major corporations in danger of compromise by crackers. The problem is simple: these companies haven’t applied all of the required security patches to their servers. The bottom line is that the security of your modified application is now in the hands of a network administrator who may not apply all of the patches required to keep the Web server safe.

Your own code is a potential source of security problems. Assumptions made by a previous developer can affect the operation of your application. The code you use to make connections might assume that there’s a secure DCOM connection in place. It might not have the additional code required by SOAP to make the connection secure. In short, if you simply add the code required to make the application work, you might be opening a huge hole in security.
Performance

As part of the process of modifying an application, you need to ensure the application makes use of all modern performance enhancing features. DCOM has been around for a long time so there’s a good chance that the application you modify today will use the programming techniques of yesterday. Those techniques didn’t always consider performance because the performance feature may not have existed.

The choice of object pooling and threading options is important because SOAP applications rely heavily on them. Anytime you can service user requests from cache, rather than making another call to the hard drive, you have increased the performance of your system. Likewise, calling on multiple servers almost always results in a performance hit. It’s important to keep the number of calls to a minimum whenever possible.

Remember that we modified the organization of the application at the beginning of this chapter. The purpose of this change is to allow internal and external call separation to ensure data integrity and application reliability. Using a segregated model also improves security. Whenever you make changes of this sort you unfortunately complicate the performance picture. Even the DCOM version of the application might not work as quickly after the change simply because you need to call more modules to complete any given task.

Measuring performance is difficult at best. A SOAP application doesn’t have the same connectivity that a DCOM application does, so you need to consider a myriad of factors when testing performance. You can break down the performance factors for a SOAP application as follows:

■ Client message creation

■ Transmission

■ Web server handling

■ XML parsing

■ Server processing
The three factors that affect performance the most are transmission, Web server handling, and XML parsing. You need to factor out the transmission time by performing a comparison with utilities such as Ping. The transmission time will vary significantly given line conditions, Internet loading, number of hops, and server loading. Retaining this factor in your performance figures gives a false indication of application performance by making the application appear slower than it actually is. However, you should factor in an average expected transmission time and make sure that any reports you create provide realistic best and worst-case scenarios. Web server handling delays often occur due to improper configuration. For example, most servers offer a keep-alive setting. Not using this particular feature means the server creates a new session every time the client makes a request—wasting valuable time. In addition, you need to check for bandwidth throttling and any TCP delay settings on the socket. All of these configuration issues will cause delays with your application that will make it perform slowly when compared to DCOM.
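As a worked example of factoring out transmission time, the Python sketch below subtracts an average round-trip baseline (such as you might collect with Ping) from end-to-end timings, while still reporting best and worst cases that include the network. The function name, the dictionary keys, and the sample timings are invented for illustration.

```python
def summarize(total_ms, baseline_ms):
    """Separate application time from network time; all values in milliseconds."""
    avg_net = sum(baseline_ms) / len(baseline_ms)
    # Remove the average round-trip time from each end-to-end measurement.
    app_only = [max(t - avg_net, 0.0) for t in total_ms]
    return {
        "app_avg": sum(app_only) / len(app_only),
        "best_case": min(total_ms),    # what a user sees on a good line
        "worst_case": max(total_ms),   # what a user sees on a bad line
        "avg_transmission": avg_net,
    }


report = summarize(total_ms=[120.0, 150.0, 180.0], baseline_ms=[40.0, 60.0])
```

With these sample numbers the average transmission time is 50 ms, so the application itself averages 100 ms, while the user-visible range runs from 120 ms to 180 ms.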
The XML parsing issue is problematic. Many developers complain that the current “off-the-shelf” XML parsers provide abysmal performance characteristics. One developer has gone so far as to write a custom XML parser with better performance characteristics (http://www-106.ibm.com/developerworks/xml/library/x-elexml/index.html). The tradeoff of using a custom XML parser is that some SOAP toolkits will break or not provide the anticipated results. For example, the Microsoft SOAP Toolkit relies on their XML parser version 3.0 or above. Even the older versions of the XML parser won’t work with this toolkit.

The client and server processing delays are the factors most under your control. Creating components that perform tasks in an optimized way is the best insurance you have for providing good application performance. Make sure you allow for external setting of any delays within the component so that the administrator can tune your component to reflect current conditions. Validate client and server performance locally using a LAN so external factors such as transmission times don’t affect the tuning process.
Troubleshooting

This chapter has shown you how to upgrade existing applications to use SOAP. In many cases, you won’t convert the application completely; you’ll add SOAP as a new feature to the existing application. Even if you do convert the application completely, you can still make use of the code you already have in place and upgrade the techniques you use to perform tasks such as handling errors. The following sections contain some of the questions that developers seem to have about upgrading their current applications (based on newsgroup and list server input).

Which DCOM applications are the best candidates for conversion?

This is one of the harder questions developers have to answer because it depends on many factors. Sometimes the determining factor is more related to the money a company has to spend on the upgrade, than to the technical feasibility of the project. When looking at an upgrade project, you should always consider the following issues:

■ Do you have all of the required source code?

■ Did the originator document the source code well?

■ Do you have all of the accompanying documentation so that you can create a block diagram of the application?

■ Will it require more time to upgrade the project than create a new one from scratch?

■ Is SOAP even the right solution for this distributed application project?
Will a SOAP application perform as quickly as its DCOM counterpart?
Of course, the real question in many situations is whether the DCOM application will work at all in the distributed environment. One of the biggest reasons for the creation of SOAP is to allow applications to work across firewalls and other Internet security measures. In some cases, you may not have any choice other than to use a technology such as SOAP to upgrade your DCOM application to work in a distributed environment. However, you can also summarize the performance factors for a SOAP application as follows:
■ Message size
■ Availability of transmission bandwidth
■ Security requirements
■ User requirements
■ Performance of any new client platforms
What are the interoperability considerations for a SOAP application versus those of DCOM?
DCOM applications work on the Windows platform and that’s it. When you reduce the number of platforms that a technology will work on, you also reduce the number of potential interoperability problems. SOAP will work on multiple platforms. As a result, you’ll face more interoperability problems when using this technology.
The number one interoperability problem is the data types used by the component versus those that SOAP supports. Appendix A will fill you in on the details. Anyone who is familiar with XML knows that it only supports certain data types and doesn’t support some of them the same way that the native platform does. SOAP supports a subset of the XML data types, so you’ll likely spend time figuring out ways to convert existing component inputs and outputs to something that SOAP can understand.
Another area of concern is vendor support. We talked about this issue a little in Chapters 1 and 2, but it bears repeating here. You need to consider the capabilities of the SOAP toolkits you need to use as part of the interoperability problem. For example, some toolkits support WSDL files, while others don’t.
CHAPTER 6
Creating Remote Access Utilities
In this chapter
Introduction
An Overview of Remote Access Utilities
Writing a Server Status Viewer
Creating a Simple Employee Check-In Application
Project
Introduction
Some of you might have assumed that everything we’ve talked about so far deals with some type of remote access. Actually, the examples in the previous chapter will work just fine in either a local or remote access scenario. You can use the previous examples equally well on a LAN and the Internet. Of course, many SOAP applications will emphasize remote access because that’s one of the main reasons to use SOAP.
This chapter will emphasize remote access utilities. These are smaller applications that perform tasks you would normally handle with a local application. For example, we’ll look at a server status application in this chapter. Normally, you’d use an administrator utility, such as Microsoft Management Console, for gaining access to server status information locally. Getting server status information can prove difficult from a remote location; using a SOAP application can reduce that problem. Using a SOAP application will also allow you to share some types of information with those outside your company without revealing all of your company’s secrets. A server status utility will also provide a service for customers who want to know whether your system is ready to receive orders or perform other work.
The first section of the chapter, “An Overview of Remote Access Utilities,” discusses several remote access utility development issues. We’ll talk about how you can use remote access utilities for your business, some of which I’ll demonstrate in the application programming examples. This section will also tell you about working with Web Services. Many businesses consider Web Services the next “killer application.” Finally, we’ll talk about development issues, such as application flexibility, using existing components, security, and other problems that you normally don’t need to worry about.
The second section of the chapter, “Writing a Server Status Viewer,” is an administrator tool example. We’ll discuss how you can use SOAP to create remote monitoring software. Of course, one of the more common monitoring requests is status information.
SOAP applications also need to support the user on the road. The third section of the chapter, “Creating a Simple Employee Check-In Application,” shows one user-support technique. This is a relatively simple, but common, task that businesses perform. We’ll look at other applications that support the user on the road as the book progresses.
Finally, the “Project” section of the chapter will lead you through the process of creating a developer utility of your own based on all of the information we’ve accumulated so far. The purpose of this section is to provide you with hands-on time with this new technology. This section is less example-oriented and more hands-on training time.
An Overview of Remote Access Utilities
What is a remote access utility? That’s a good question to ask because many people have a preconceived notion of what this type of utility is without any basis in fact for their opinion. Many developers view remote access utilities as small applications that perform a single task. It just happens that the utility also performs this task across the network wire, making it a remote access utility rather than a local utility. Other people view remote access utilities as
some type of new application. Some developers go so far as to equate remote access utilities with Web Services. In short, we need a working definition of remote access utilities before we can even explore the genre in light of SOAP application development. The following sections will help clear away some of the confusion. We won’t define a remote access utility as a single application type, but as applications with specific characteristics. Many of you will recognize your personal definition in this section, but that definition will also appear with other utility types you might not have considered. The concept of remote access utilities is broader than you ever thought. Certainly, I was surprised at the diversity of applications that people came up with when questioned about the topic.
Uses for Remote Access Utility Applications
It’s important to understand that SOAP, like any protocol, has limitations that you need to observe. For example, one question that I saw online was how to pass a java.awt.Image from server to client using SOAP. The response to this question was that if you want to pass a java.awt.Image, you should use Java Remote Method Invocation (RMI). Using Java RMI allows the developer to use native coding techniques. In short, SOAP isn’t the end-all answer to every programming need, even when creating a utility.
Of course, the problem of passing an image from a server to a client has a SOAP solution. One of the suggestions is to pass the image as a Graphics Interchange Format (GIF) byte array. If you look at Appendix A, “SOAP Data Types and Data Type Conversions,” you’ll find that a byte array is one of the native formats for SOAP because it’s also a native format for XML. Some developers suggest using a data stream for the job and allowing the SOAP server to format the message as a Multipurpose Internet Mail Extensions (MIME) attachment. The point is that you need to use SOAP in the way that the designers intended.
This leads to the discussion at hand: how you can use SOAP for remote access utilities. Downloading graphics images is one example of a remote access utility. You don’t need a large program to download the graphics. The only time you need a large application is if you want to allow the user to manipulate the graphics in some way. This differentiation is one of the things you should consider for a utility program. A utility normally performs several simple functions, some of which may be a subset of what you would get when working with a larger application. The following lists some of the best uses for utility programs when working with SOAP:
■
Monitoring: One of the most important classes of utility application is the type that monitors some type of activity. Microsoft has stuffed Windows with utilities that perform this service. Many network administrators also know Unix for the utilities that it supports. Some of these utilities operate at the command line, while others support a GUI. Many support some type of remote operation to allow a local user to check the status of another machine on the network. SOAP can extend the capabilities of these applications from the LAN to the Internet.
■
Simple Data Manipulation: We’ve already talked about one form of this remote access utility. Manipulating graphics in some way is a task that most developers will perform at some time in their career. The simple act of moving a graphic from the local network to the Internet can prove problematic without the proper protocol support. SOAP provides a method for smoothing the movement of data, but it can also allow you to access and manipulate the data from a remote location.
■
Services: Developers usually call utilities that provide access to some type of resource a service. These utilities do everything from monitoring the number of users accessing your server to updating the time on all of the network workstations. The Internet has extended the concept of service from the network to the world. SOAP provides the means for accessing these Web services. However, you don’t have to reserve SOAP for those services that everyone will access—SOAP works fine on the LAN as well.
■
Configuration: Utilities often allow users to configure an application, an operating system, a component, or another resource. In some cases, utilities allow centralized configuration of many resources. For example, developers commonly use a single utility to configure all of the hardware on a machine. In other cases, a single utility will allow configuration of many different objects. For example, operating systems often handle security configuration using a single utility. SOAP simply allows you to extend your ability to configure resources from the local workstation or network to the Internet as a whole.
■
Maintenance: Keeping a system running properly requires maintenance. However, performing maintenance on a large system is often a painful task because utilities might require local access. In recent years, the use of agents has allowed remote utility use, but primarily over the network alone. SOAP will extend the maintenance utility by allowing the network administrator to gain access to low-level information across the Internet.
■
Task Scheduling: Automation of all types is a requirement in today’s harried world of overworked network administrators. Most operating systems include a simple task scheduler that works for most situations. However, developers still have to create custom schedulers for complex applications. The scheduler often takes the form of a utility that resides outside the main application.
■
Other: Only your imagination limits the ways in which utilities appear within your company. You’ll find that many developers create and use utility applications without much thought. They represent the small, single task applications that someone created for a simple need. SOAP merely extends the ways in which you can use the simple utility within your organization and those outside it.
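The byte-array suggestion mentioned at the start of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the urn:example:images namespace and the GetImageResponse and ImageData element names are hypothetical, and a real toolkit would normally build the envelope for you. The envelope namespace is the one defined for SOAP 1.1.

```python
import base64
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:images"  # hypothetical application namespace


def build_image_response(gif_bytes):
    """Wrap raw GIF bytes in a SOAP body as base64 text."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    response = ET.SubElement(body, f"{{{APP_NS}}}GetImageResponse")
    data = ET.SubElement(response, f"{{{APP_NS}}}ImageData")
    # Binary data cannot travel as raw XML text, so encode it first.
    data.text = base64.b64encode(gif_bytes).decode("ascii")
    return ET.tostring(envelope, encoding="unicode")


def extract_image(xml_text):
    """Recover the original bytes on the client side."""
    root = ET.fromstring(xml_text)
    data = root.find(f".//{{{APP_NS}}}ImageData")
    return base64.b64decode(data.text)
```

Base64 encoding makes the message roughly a third larger than the original image, which is one reason some developers prefer the MIME attachment approach for larger graphics.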
Remote Access Applications: Not Necessarily Small
Most of the remote access applications you run into will provide simple, single-task functionality. In many cases, remote access applications are also small and easily described. However, these two features describe only most of the applications out there. A few remote access applications are quite large and conceivably complex in implementation, albeit simple in concept.
Consider the current effort by Ariba, IBM, and Microsoft to develop the first truly functional “yellow page” service for the Internet. The Universal Description, Discovery, and Integration (UDDI) registry will provide a telephone book-like service for businesses worldwide. Most people would agree that a telephone book is relatively simple in concept, but complex in implementation.
It seems that this UDDI effort is a little more robust than the average telephone book. Businesses will actually participate in three types of telephone book entry: white, yellow, and green. A white page book will contain general business information, such as name, address, telephone number, e-mail address, and Web site URL. The white page book will also contain information about the types of services that a business provides and which protocols it supports. As with most white pages, the supporting vendors will organize this one in alphabetical order by business name, but given the realities of computer flexibility, you’ll probably be able to sort it in any order needed. The yellow page book will tag businesses with government operation codes. These codes standardize the business classification, such as dry cleaners or consulting services. The vendors intend to organize the yellow page book by operation code, geographic area, international naming protocols, and technology-based naming protocols. The green pages are where this book comes into play. They describe what types of documents a company can receive, entry points for transactions, and the types of technology the business supports. This information will help you find companies that support the same documents that your company does. It means that you’ll be able to write a single client that can interact with multiple companies, even if you don’t know what protocols that company will support now. In short, this type of remote access application will save time in the future.
Understanding the Web Services Difference
I mentioned Web Services several times in the previous section. One of the biggest reasons to use SOAP with a utility is to provide resource access to those outside of your company. The resource could be something as simple as the local time or as complex as your company’s catalog. The utility could provide one-way access in the form of an informational display, or two-way access in the form of a data manipulation program. The bottom line is that the utility provides a service that people outside your company might need.
Don’t get the idea that Web Services necessarily equate to component technology. A Web Service can be any resource that a vendor wants to make available to the public at large using a standardized access method. In addition, Web Service access doesn’t have to rely on SOAP; you can access Web Services using other techniques. This section of the chapter is looking at one possibility for Web Service support. A discussion of Web Services as a whole is outside the scope of this book. We’re looking at a specific view of Web Services.
One of the most obvious ways in which a Web Service differs from the utilities you’ve created in the past is that people outside your company will need to access it. This means that you’ll want to polish the appearance of the user interface a bit more than usual and might want to provide some form of help with the service (if possible). Security is also an essential part of a Web Service. You want to ensure that people using Web Services won’t gain access to parts of your company that you want to remain hidden. Reliability is also a concern. Your customers won’t be impressed if the service your company provides constantly breaks. Web Services also differ from the typical utility in that you’ll need to advertise them in some way. If you only want selected customers to know about the utility, then a simple phone call or e-mail announcement will suffice. However, if you’re like most Web Service developers,
you won’t even know the people using your service—at least not immediately. This is where other protocols, such as UDDI, come into play. These other protocols help make your SOAP applications visible to others. Some vendors, such as Microsoft, are tying Web Services to their applications. The application will theoretically work fine without the Web Service, but adding the Web Service increases application functionality in some way. By charging a small amount each time a user accesses the service, the vendor can create a continuous revenue flow. In this case, the Web Service is a value-added packaging methodology.
Web Services is becoming an increasingly big issue as companies seek ways to connect through the Internet. You’ll find many viewpoints about this technology. Get the IBM view at http://www.alphaworks.ibm.com/tech/webservicestoolkit. The Microsoft perspective appears at http://msdn.microsoft.com/webservices/. You can find a wealth of discussions at http://search.userland.com/ default?s=1&m=50&q=Web+Services&site=All.
At least a few vendors also see the value of Web Service for collaboration. For example, a travel agent could add your hotel reservation component to their application. Every time they need a hotel reservation that your company can fulfill, your component makes the required entries in the database. Using Web Services this way means that companies can create extremely efficient and complex applications using component parts. The best part is that the companies that use your Web Service don’t need to know anything about your database or the method used to access it. All they need to know is the standard document format that your company uses for exchanging data.
When Is SOAP Overkill?
Some developers will ask whether they need to use SOAP for remote access utilities at all. The answer can become complex. A short answer is that it depends on what you expect the application to do when complete. SOAP truly is overkill for some types of applications, especially if your sole purpose in implementing the SOAP application is getting around a pesky firewall.
For some experts, the keyword for SOAP applications is interoperability. If your application will never see use outside your company, then a solution that relies on ASP might be all you need. It doesn’t pay to use SOAP if a simpler technology will work. However, it’s hard for most developers to determine at the outset of a project whether outside entities will use the data from that project. Determining interoperability concerns at the outset of a project is important, but we all know those concerns will change as the project develops.
Other experts see the question as one of complexity. SOAP allows you to overcome specific types of obstacles when working with Web applications. Although an ASP-based application might work for viewing static data, these experts argue that modern applications require greater flexibility. SOAP can provide much needed flexibility in Web-based applications. However, if all the user will ever do is view data, SOAP could be overkill. You’ll normally
reserve SOAP for situations in which some type of data exchange (one-way or two-way) takes place.
Another concern for remote access utilities is longevity. SOAP does require an investment in time and effort. Although this investment will become less as SOAP toolkits mature, you still need to consider the investment. You’ll waste that investment if the company uses the application for a short time and then discontinues it in favor of a new solution. Utility programs are risky in this respect. Many people begin developing a utility as throw-away code for a task of the moment. In some cases, the utility outlives the originally intended use. In many cases, though, it dies a quick death when the developer finishes using it.
Making Utility Programs Flexible
One of the ways to hedge your bets with utility programs is to make them both modular and extensible. Modular utilities tend to perform one task well per component. Modifying such a component to perform another, similar task is usually simple. In fact, if you design the component well, you might be able to use it for multiple tasks without modification.
Extensible components contain code that allows augmentation of a particular function. For example, a configuration setting might adapt the component to other purposes. A basic math module might perform different tasks depending on the equation that you feed it as part of the input. (This is obviously a simple example, but one that clearly illustrates the point I’m trying to make.)
You can create configuration settings for components in a number of ways. Here are the techniques I use, in order from least to most difficult.
1. The easiest method is to pass a configuration string as one of the arguments, but this isn’t always practical and can cause security problems.
2. The second method is using registry entries that the component can check during the initialization phase. This method is a little more secure, but tends to get messy when you want to configure different instantiations of the same object in different ways. You’d need a registry entry for each potential configuration and some method for differentiating between them.
3. You could create the component as a COM+ component and place the configuration settings in the COM+ application or within the COM+ Catalog. This method is more secure than the registry method because the settings remain on the server. It is also less messy when you need to configure the same component in different ways. Using COM+, however, adds complexity to the application that you might not want to support.
4. Placing the settings in a stream and saving them to disk is another method. The component can load the property bag and use the settings it contains to configure itself during the initialization phase. This method is less secure than COM+, but more secure than the registry method because the settings are still located on the server. The methods for working with a property bag are well documented and precise. However, some developers find that this technique has all of the problems of the registry method without the ease of access that the registry provides.
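The fourth technique, saving settings to a stream on disk and loading them during initialization, might look something like the following Python sketch. It is not the COM property bag mechanism itself; it uses a JSON file as a stand-in, and the StatusComponent class, the setting names, and the save_settings helper are all hypothetical names chosen for the example.

```python
import json
import os


class StatusComponent:
    """Toy component that configures itself from a settings file."""

    DEFAULTS = {"poll_seconds": 60, "include_disk_usage": False}

    def __init__(self, settings_path):
        self.settings = dict(self.DEFAULTS)
        # Initialization phase: overlay whatever the saved stream provides.
        if os.path.exists(settings_path):
            with open(settings_path, "r", encoding="utf-8") as stream:
                self.settings.update(json.load(stream))


def save_settings(settings_path, settings):
    """Write the settings to disk so they stay on the server."""
    with open(settings_path, "w", encoding="utf-8") as stream:
        json.dump(settings, stream)
```

Because each instance receives its own file path, two instantiations of the same component can carry different configurations without the bookkeeping that the registry method requires.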
Shortcuts for Using Existing Components with Utilities
We all have “junk” components sitting around. They were useful at one time, but they’re not usable today for whatever reason. Most of us avoid creating new junk components because of the time wasted. Creating and debugging a component takes time and effort that we could use in other ways. Developers are in a constant time crunch, and the situation only promises to get worse.
The big question, then, is how to use existing components within SOAP applications. The surprising news is that some components will run just fine as they are right now. We’ve already discussed this fact as part of the Distributed Component Object Model (DCOM) discussion in Chapter 5, “Migrating an Application from DCOM to SOAP.” Converting old applications isn’t only possible; it’s also the method that Microsoft and other vendors recommend. Large vendors want to encourage code reuse so that you’ll try their new technology. It’s in their best interest to make sure you can use as much code as you can.
I’ve actually managed to resurrect some aging components from the scrap heap by using SOAP. This might seem somewhat amazing until you consider the fact that we’re taking a half step back in the component technology development. All SOAP toolkits support DLL (in-process) components, the same components you’ve been using on the desktop for ages. Remember that out-of-process (EXE) components didn’t really become popular until DCOM arrived on the scene, and developers had to make the switch.
Some, but not all, SOAP toolkits also support the EXE form of component. Unfortunately, the Microsoft SOAP Toolkit isn’t one of those that support the EXE format. If you’re using Visual Basic, switching to a DLL format might be as easy as loading the project, changing the Project Type setting to ActiveX DLL (see Figure 6.1), and recompiling the application.
However, what happens if you don’t have the source code for the component or the component isn’t easy to change to a DLL format? Many developers create a DLL component that loads and accesses the EXE version. This method has the disadvantage of increasing latency and using more server resources. This might not be a problem if only a few users require access to the component and the component is small.
The “second tier” component approach also works for cases in which the old component uses data types that SOAP doesn’t recognize and you don’t want to go to a lot of effort to rewrite the code. You can use the second tier component to translate the data types from SOAP into something the older component will understand. Of course, now you’re adding still more latency and resource usage to the picture.
Another approach to handling incompatible data required by an older component is to write a user-defined type (UDT) for SOAP. In this approach, you’ll create the Web Services Description Language (WSDL) file as usual. It will contain areas with question marks where the utility didn’t understand the data type that the component used. You’ll need to edit the file manually to define the unrecognized data type and provide code to handle the data transfer.
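The “second tier” idea of translating data types can be shown with a small Python sketch. It assumes, purely for illustration, an older component that expects a date object and a decimal amount, while the SOAP side delivers plain strings. LegacyInvoiceComponent and InvoiceAdapter are hypothetical names, not part of any toolkit.

```python
from datetime import date
from decimal import Decimal


class LegacyInvoiceComponent:
    """Stand-in for an older component with platform-specific types."""

    def total_due(self, due, amount):
        # Expects a date object and a Decimal, not strings.
        return f"{amount:.2f} due {due.isoformat()}"


class InvoiceAdapter:
    """Second-tier wrapper that exposes string-only parameters."""

    def __init__(self, legacy):
        self._legacy = legacy

    def total_due(self, due_text, amount_text):
        # Translate the string inputs into the types the old code wants.
        due = date.fromisoformat(due_text)
        amount = Decimal(amount_text)
        return self._legacy.total_due(due, amount)
```

The older component stays untouched; only the thin wrapper knows about both type systems. The price is the extra call on every request, which is the latency cost described above.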
Figure 6.1 Sometimes, switching the project type is all you need to do.
Utility Program Security Issues
Utilities have all of the same problems that most applications do when it comes to security. Any data you transfer is open to scrutiny until you secure it; users are just as apt to dig for hidden secrets as they are when working with a database application. In fact, you might find that users are even nosier than usual because they’ll assume that your utility program isn’t as secure as the big application.
Utility programs do have an advantage in that the data they handle is normally less sensitive than your typical database application. In addition, utilities normally transfer small amounts of data, making it harder to intercept something that you wouldn’t want someone else to know.
Some security issues are peculiar to utility programs. I’ve already mentioned one potential security issue for utility programs: setup strings. Crackers will use even the smallest cracks in your security to cause problems. In fact, they tend to like small, unnoticeable problem areas. That’s why you’ll want to be sure that if your utility relies on configuration settings, you act to secure them. In some cases, the settings might be so innocuous that you don’t have to do anything at all, but this is likely to be the exception, rather than the rule.
The fact that utility programs often provide low-level access to your server or the applications it supports is also a problem. I know that I wouldn’t want someone outside of my organization to use some of the utilities that I create. Sure, they provide useful functionality, but that functionality is such that the security of my system would be at risk if someone else accessed them. Not all utilities have this problem, but you’ll still want to consider the issue when you create the SOAP version.
Of course, the biggest threat to security when working with utility programs isn’t the cracker lurking outside your firewall with evil intent—it’s the employee working freely inside the firewall whom the boss fired just a moment ago. Directing your attention outside and forgetting the inside of your organization is asking for trouble. Utility applications are small, fast, and specifically designed to perform one task well. Someone bent on destruction after a bad day in the office won’t have to go through much effort to wreak havoc on your system. The very feature that makes utility programs a must-have also makes them incredibly dangerous for the unwary developer.
Non-Issues for Utility Programs
The simple nature of utility programs means there are issues you don’t need to consider. For example, it’s unlikely that you’ll manipulate a database with your utility program, because that’s normally the domain of larger database applications. As a result, you’ll find that you don’t have to have multiple tiers of components to process requests reliably. Utility programs are simple.
The following sections will help you understand some of the non-issues for utility programs a bit better. These are issues that you’d normally think about when creating an application, but that you can disregard when working with utility programs. The first non-issue is a special SOAP header processing consideration. Utility programs normally provide direct access to the data they manipulate, so you don’t have to worry about actor support. We’ll look at other issues as well.
SOAP Headers Processing
One of the issues that developers worry about is how SOAP will treat SOAP messages with multiple header entries. Each entry defines a special processing need. You might want one server to check the reliability of the message and another to check for any log entries associated with the message. For example, look at this multiple entry SOAP header. (This header is by no means typical; I’m only using it for discussion purposes.)
.... .... ....
The details of this header aren’t important. Consider for a moment, however, that different servers process the authenticate, transaction, and log entries. Server A will process the authenticate header and pass the resulting message to Server B. Note that we’re talking about the resulting message, not the original message. Server B will process the transaction header and pass the resulting message to Server C. Server C will process the log header and pass the result to the client. In short, SOAP relies on a processing chain, not a centralized processing method. This differs from technologies such as COM+, where a central authority monitors the process.
What happens if Server B experiences an error? Instead of passing the message to Server C, Server B will report the result back to the client. Header processing always stops after the first error. This means the client gets immediate feedback and the system doesn’t waste time processing a request that won’t succeed anyway. In some ways, this makes SOAP more efficient than some technologies such as COM+, where all servers perform the requested task and then vote on the outcome of a transaction.
This methodology has implications for the developer as well. Because SOAP stops processing on the first error, you’ll end up fixing one error at a time when debugging an application, even with proper development tools. Any developer who has spent much time working with distributed applications knows that the one-at-a-time approach is the most time-consuming method to debug an application. Given the way that SOAP handles headers, you’ll want to set additional time aside for debugging when working in this environment.
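The chain behavior described above can be modeled in a few lines of Python. This is a simplified sketch, not a SOAP implementation: the message is a plain dictionary rather than XML, and the handler functions, the HeaderError class, and the header values are all assumptions made for the example.

```python
class HeaderError(Exception):
    """Raised by a server that cannot process its header entry."""


def authenticate(message):
    if message["headers"].get("authenticate") != "secret":
        raise HeaderError("authentication failed")
    return message


def transaction(message):
    if "transaction" not in message["headers"]:
        raise HeaderError("missing transaction id")
    return message


def log(message):
    message.setdefault("log", []).append("request logged")
    return message


def process_chain(message, servers):
    """Pass the message from server to server, stopping at the first error."""
    for name, handler in servers:
        try:
            message = handler(message)
        except HeaderError as error:
            # The failing server reports straight back to the client.
            return {"fault": f"{name}: {error}"}
    return {"result": message}


# Each server handles exactly one header entry, in order.
SERVERS = [("Server A", authenticate), ("Server B", transaction), ("Server C", log)]
```

If a request has both a bad password and a missing transaction entry, the client sees only the Server A fault. The second problem stays hidden until the first is fixed, which is the one-error-at-a-time debugging cost noted above.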
Interoperability
We’ve spent a lot of time in this book discussing the issue of interoperability. From the beginning of Chapter 1, “An Overview of SOAP,” to the last example in Chapter 5, “Migrating an Application from DCOM to SOAP,” interoperability has been an issue because SOAP is a new protocol. The fact that servers from different vendors can communicate at all is a miracle considering that this never occurred when working with DCOM or CORBA. Microsoft never ported DCOM to other servers, and you’d be hard-pressed to find CORBA on every platform either. The imperfect implementation of SOAP we have today is definitely a step in the right direction, but it will take time for vendors to make things work correctly.
You don’t have to worry about interoperability issues when working with utility applications for the most part, unless your organization sports dozens of incompatible platforms. Utility programs that span the network are normally used by people within the organization, not by those outside of it. Keeping the application for local use means you don’t have to worry about outside clients attempting to access your component.
Utilities that are used by outside parties tend to use simple data types. For example, someone who wants to configure his or her user settings will normally send strings to your component, not complex data. Every SOAP toolkit on the market today supports the basic data types listed in Appendix A, so it’s unlikely you’ll run into interoperability problems in this arena either.
Chapter 6
Creating Remote Access Utilities
Attachments
Remember that a SOAP message can appear as part of a Multipurpose Internet Mail Extensions (MIME) message. The other parts of the message can contain attachments that don’t transfer well using XML encoding, such as graphics. Utility programs rarely, if ever, need to transfer complex data of this type, so you seldom need to worry about attachments. Using the standard high-level messaging technique found in Chapter 4, “Using SOAP to Create a Simple Application,” makes creating messages much easier and faster.
Obviously, there are exceptions to the rule, but you can normally reduce the amount of coding you need to perform. For example, consider the simple data manipulation entry in the “Uses for Remote Access Utility Applications” section of the chapter. This is, in fact, the only class of utility for which you’ll need to consider attachments. You might use a utility to move a graphics image from the client to the server. Creating generalized code to perform the task will allow you to write the application once and never worry about it again.
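To make the MIME packaging idea concrete, here is a minimal sketch that uses Python’s standard email package to wrap a SOAP envelope and a graphics attachment in one multipart message. The StoreImage method name, the image1 content ID, and the envelope body are assumptions made for illustration; a real toolkit would generate its own packaging.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical request: the StoreImage element and its href are illustrative only.
SOAP_ENVELOPE = """<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <StoreImage><image href="cid:image1"/></StoreImage>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def build_message(image_bytes: bytes) -> MIMEMultipart:
    """Package the SOAP envelope and one binary attachment together."""
    message = MIMEMultipart("related", type="text/xml")

    # First part: the SOAP envelope itself, sent as XML text.
    message.attach(MIMEText(SOAP_ENVELOPE, "xml", "utf-8"))

    # Second part: the binary data, which travels outside the XML.
    # The subtype is given explicitly, so no content sniffing takes place.
    image_part = MIMEImage(image_bytes, "png")
    image_part.add_header("Content-ID", "<image1>")
    message.attach(image_part)
    return message


packaged = build_message(b"\x89PNG placeholder bytes")
print(packaged.get_content_type())
```

The envelope refers to the attachment by its content ID, so the binary data never has to be encoded as XML.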
Experimentation
Some developers I know are stressed about every line of code they write; they don’t have fun with their trade (in part, because they’re always under severe deadlines). Utility programs are my favorite kind of coding because they allow me to experiment. You can’t get away with writing a million-line database application and then throwing it away. Everyone will think you’re crazy. On the other hand, I commonly write small utility programs to test a new idea. I’ve discovered a lot about programming by sitting down and seeing what’s possible with a utility. The programs are small and no one cares if I throw them out later, not even me.
This concept of toss-away code is foreign to many developers today because everyone is thinking of ways to save every line of code. Writing lots of code is expensive; it’s an investment worth protecting. However, utilities are cheap when it comes to coding time. You gain knowledge, and if you design the utility properly, you perhaps have a new piece of code to add to your library.
The final non-issue for utility programs is the time invested coding them. The utility program is small, easy to understand, and performs just a few tasks well. The education you receive more than pays for the time you invest in creating one. The functionality you get is just icing on the cake.
Writing a Server Status Viewer
This example will extend the example in Chapter 5 by showing another way of gathering server status information. We’ll add the ability to poll the status of all of the services installed on the server. Of course, you can make this application as robust as you want; this example provides a skeleton you can build on later. The important consideration in a multiplatform environment is finding statistics that every server supports, or at least customizing the client to automatically compensate for platform differences.
The server-side component in this example works with the Windows API to poll the state of each installed service. You’d need to provide a different server-side component for every platform you want to support, but you can use the same component for all servers with the same operating system, such as all Windows machines. The client, however, could remain the same because all we’re passing is strings from the server to the client. As long as you maintain a consistent interface, you won’t experience problems with using the same client across multiple platforms. The servers, of course, will present interesting challenges because each vendor has a different way of measuring the current server state.
Creating the Server-Side Component
The simple part of this component is that it requires no input and provides a string as output. The server-side component for this example performs low-level access of the operating system using API calls to complete the task. You’ll find that this is common for certain categories of utility components. A utility provides a “black box” for an outside application that makes it easy to perform system-level tasks. In this case, we’re polling the system service state on a server.
Polling a server for service information is relatively easy when you’re working with Visual C++, but presents some interesting challenges when working with Visual Basic. The ServicesDeclarations.BAS file found in the \Chapter 06\Service State Component directory of the source code available from the Que Web site will help make working with services in Visual Basic much easier. You can find the source code at http://www.quepublishing.com. The example component shows how to use the various service functions to obtain state information. You can also remotely add new services, delete services that you no longer need, and modify the state of existing services using the same techniques.
Another criterion for this component is that it performs this service in a server-neutral fashion. In other words, you should be able to place the component on any server and it should work without any configuration. In some cases, this would be a difficult requirement because many system calls require server-specific information. Fortunately, the API calls we’re using in this example don’t require precise server knowledge, so creating a generic component is relatively easy.
Now that you have some idea of what we’re going to do and why we’re going to do it, let’s look at the code for the server-side component. Listing 6.1 contains the source code for just the component class. It’s important to remember that the component requires a lot more code to function in this case. The additional, generic, code appears in the ServicesDeclarations.BAS module as part of the source code available from the Que Web site.
The source code makes every effort to use API standard names for function calls and constants. You can look up these functions in any Windows API help file for a full discussion of all arguments and constant values. The DLLs do use alternative names for the functions to support both standard and wide (Unicode) character sets. You’ll find these alternate names in the ServicesDeclarations.BAS file.
Listing 6.1 GetStatus() Method Source Code
Public Function GetStatus() As String
    Dim SCM_Handle As Long                'Result value or service control manager handle.
    Dim BufferSize As Long                'Returned service listing buffer size.
    Dim Result As Long                    'Result value.
    Dim Services() As ENUM_SERVICE_STATUS 'A list of the service status values.
    Dim BufferBytesNeeded As Long         'The size buffer needed to hold the data.
    Dim ServicesReturned As Long          'Number of entries returned.
    Dim ResumeHandle As Long              'Handle for next service entry.
    Dim BufferElements As Long            'Number of buffer elements required.
    Dim Outstring As String               'Final output.
    Dim ResultString As String            'Intermediate output from array.
    Dim counter As Long                   'Loop counter.

    'Open the Service Control Manager (SCM).
    SCM_Handle = OpenSCManager(vbNullString, _
                               SERVICES_ACTIVE_DATABASE, _
                               SC_MANAGER_ENUMERATE_SERVICE)
    If SCM_Handle = 0 Then
        MsgBox "Failed to open SCM. DLL Error is: " _
               + CStr(Err.LastDllError), _
               vbOKOnly Or vbCritical, _
               "SCM Open Error"
        Exit Function
    End If

    'Determine the buffer size required. We should receive an error
    'stating the buffer isn't big enough. The returned values will
    'contain the required buffer size.
    BufferSize = 0
    Result = EnumServicesStatus(SCM_Handle, _
                                SERVICE_WIN32, _
                                SERVICE_STATE_ALL, _
                                &H0, _
                                BufferSize, _
                                BufferBytesNeeded, _
                                ServicesReturned, _
                                ResumeHandle)
    If Not Err.LastDllError = ERROR_MORE_DATA Then
        MsgBox "Couldn't determine the required buffer size. DLL Error is: " _
               + CStr(Err.LastDllError), _
               vbOKOnly Or vbCritical, _
               "SCM Open Error"
        'Release the SCM handle before leaving.
        CloseServiceHandle (SCM_Handle)
        Exit Function
    End If

    'Calculate the buffer parameters. Determine the number of elements
    'required. Redimension the array to contain that number of elements,
    'and then calculate the actual buffer size.
    BufferElements = BufferBytesNeeded / Len(Services(0)) + 1
    ReDim Services(BufferElements - 1)
    BufferSize = BufferElements * Len(Services(0))

    'Retrieve the service status information.
    ResumeHandle = 0
    Result = EnumServicesStatus(SCM_Handle, _
                                SERVICE_WIN32, _
                                SERVICE_STATE_ALL, _
                                Services(0), _
                                BufferSize, _
                                BufferBytesNeeded, _
                                ServicesReturned, _
                                ResumeHandle)
    If Result = 0 Then
        MsgBox "Failed to retrieve service status information. DLL Error is: " _
               + CStr(Err.LastDllError), _
               vbOKOnly Or vbCritical, _
               "SCM Open Error"
        'Release the SCM handle before leaving.
        CloseServiceHandle (SCM_Handle)
        Exit Function
    End If

    'Interpret the data.
    For counter = 0 To ServicesReturned - 1

        'Copy the display name out of the API buffer.
        ResultString = Space(250)
        Result = lstrcpy(ByVal ResultString, ByVal Services(counter).DisplayName)
        ResultString = Trim(ResultString)
        If Len(ResultString) > 1 Then
            ResultString = Left(ResultString, Len(ResultString) - 1)
        Else
            ResultString = "N/A"
        End If
        Outstring = Outstring + ResultString + " ("

        'Copy the service name out of the API buffer.
        ResultString = Space(250)
        Result = lstrcpy(ByVal ResultString, ByVal Services(counter).ServiceName)
        ResultString = Trim(ResultString)
        If Len(ResultString) > 1 Then
            ResultString = Left(ResultString, Len(ResultString) - 1)
        Else
            ResultString = "N/A"
        End If
        Outstring = Outstring + ResultString + ") "

        'Translate the current state into text.
        ResultString = Space(250)
        Select Case Services(counter).ServiceStatus.CurrentState
            Case SERVICE_STOPPED
                ResultString = "Service is Stopped"
            Case SERVICE_START_PENDING
                ResultString = "Service is Starting"
            Case SERVICE_STOP_PENDING
                ResultString = "Service is Stopping"
            Case SERVICE_RUNNING
                ResultString = "Service is Running"
            Case SERVICE_CONTINUE_PENDING
                ResultString = "Service is Going to Continue"
            Case SERVICE_PAUSE_PENDING
                ResultString = "Service is Going to Pause"
            Case SERVICE_PAUSED
                ResultString = "Service is Paused"
            Case Else
                ResultString = "Status Unknown or Error"
        End Select
        Outstring = Outstring + ResultString + vbCrLf
    Next

    GetStatus = Outstring

    'Close the SCM handle.
    CloseServiceHandle (SCM_Handle)
End Function
As you can see, the code begins by opening a connection to the service control manager (SCM) using the OpenSCManager() API function. The SCM is the central authority for manipulating services on a Windows NT/2000/XP machine. As part of the request for access, you need to specify the name of a database to open and the level of access required for the call. We’re using the active database in this example; there’s also an inactive database. The call also tells the SCM that we want to enumerate services. You can request many other activities, such as changing the status of a service or adding a new one.
The code calls the EnumServicesStatus() function twice. The first call determines the size, in bytes, of the buffer required to hold the service status information. This size doesn’t directly correlate to the number of array entries used to hold the status information, so the next step is to perform some calculations and redimension the Services array. We then need to set the final size of the buffer so that we can pass it along on the second EnumServicesStatus() call. The second call retrieves the service status information. Note that the Services array is 0 based because this is an API call. Figure 6.2 shows the complex data structure returned for each service.
Figure 6.2 The Services array is a composite of two data structures.
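The two-call pattern is easier to see in isolation. The sketch below imitates it in Python with a stand-in function; enum_services(), the fake service list, and the 36-byte element size are assumptions made for illustration and are not the real Windows API.

```python
ERROR_MORE_DATA = 234   # Win32 error code of the same name
ENTRY_SIZE = 36         # assumed size in bytes of one array element

# Stand-in data that the real API would read from the SCM.
FAKE_SERVICES = [("Alerter", "Stopped"),
                 ("Messenger", "Running"),
                 ("Spooler", "Running")]
STRING_BYTES = 120      # pretend the name strings need this much extra room


def enum_services(buffer_size: int):
    """Hypothetical stand-in for EnumServicesStatus().

    Returns (success, error, bytes_needed, entries).
    """
    needed = len(FAKE_SERVICES) * ENTRY_SIZE + STRING_BYTES
    if buffer_size < needed:
        return False, ERROR_MORE_DATA, needed, []
    return True, 0, 0, list(FAKE_SERVICES)


# First call: pass a zero-length buffer to learn the required size.
success, error, bytes_needed, _ = enum_services(0)
assert not success and error == ERROR_MORE_DATA

# Convert bytes to whole array elements (one extra for safety), then
# compute the final buffer size from the element count.
elements = bytes_needed // ENTRY_SIZE + 1
buffer_size = elements * ENTRY_SIZE

# Second call: the buffer is now large enough to receive the data.
success, error, _, services = enum_services(buffer_size)
print(elements, buffer_size, len(services))
```

The byte count exceeds the space taken by the array elements alone because, in this sketch, the same buffer also has to hold the name strings. That is why the element count has to be calculated rather than assumed.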
Notice that each array element consists of two data structures. The first contains the enumerated service description, while the second contains the status information. The DisplayName and ServiceName entries are actually pointers to strings. The component uses the lstrcpy() API call to retrieve the actual string values. However, this call relies on a pointer to a string buffer, not a Visual Basic string. After the call, ResultString contains the copied text followed by a zero (null) character that marks the end of the string, plus whatever padding spaces remain. Unfortunately, the zero also makes it impossible to concatenate the string properly, so we need to trim the padding and remove the final character as shown in the code.
The status information is a series of flags that we can detect using constants. Listing 6.1 shows how to detect the current service status. You’d use similar techniques to determine other status information.
The component code ends with a call to CloseServiceHandle(). Make sure you deallocate the handle or your component will quickly develop a memory leak.
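The string cleanup step can be tried out without any API calls. This Python sketch simulates the contents of a 250-character buffer after a copy and shows two ways to recover the text; the from_api_buffer() helper is a hypothetical name introduced for the example.

```python
def from_api_buffer(buffer: str) -> str:
    """Return the text that precedes the first null character."""
    text = buffer.split("\0", 1)[0]
    return text if text else "N/A"


# Simulate the buffer after the copy: the name, a terminating null,
# and then the spaces left over from the original padding.
name = "Messenger"
buffer = (name + "\0").ljust(250)

# The approach used in the listing: trim the padding spaces, then drop
# the final character, which is the terminating null.
trimmed = buffer.strip(" ")
listing_result = trimmed[:-1] if len(trimmed) > 1 else "N/A"

print(repr(listing_result))
print(repr(from_api_buffer(buffer)))
```

Cutting at the first null is the more defensive of the two approaches, because it doesn’t depend on the padding being made of spaces.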
Tips for Working with Server Status Information
Many utility applications revolve around maintaining the server in some way. At least some of these utilities provide information alone; they don’t allow you to change the server configuration. A network administrator needs this constant flow of information to maintain the network properly. Consider how many command-line utilities check basic information such as the server’s IP address and the number of required hops to transfer data from one node to another. Microsoft recognizes the need for information by the network administrator and constantly improves their management suite, but there’s always room for more high-quality management tools.
Most of the utilities that you’ll create require some type of enumeration. The sample in this section is a good example of such a utility. Servers often manage multiple copies of a single object. Providing resources to multiple clients is at the core of the server, so it’s not too surprising that you’ll enumerate (list) items regularly. Everything from the computers attached to the server to the processes requiring service comes in multiples. Building good enumeration routines and recognizing the need to enumerate data is an essential part of developing useful server status utilities.
Obtaining service status consumes quite a bit of space, even if you ask for the least possible amount of information. If you look at the Services MMC snap-in (Figure 6.3) you’ll see that every Windows 2000 machine contains a number of services. Returning just the names and status will require more space than the typical dialog box provides. That’s why you’ll normally use a text box or other control with scrolling capability to display the information.
SOAP seems to play an interesting trick with the data you transfer using it. When you send data formatted with carriage returns and linefeeds, the data arrives at the other end with the linefeeds intact but without the carriage returns. According to several vendors, this is the anticipated action of the XML parser, which normalizes every line ending to a single linefeed. The result is that data that looks great in a dialog box won’t break within a text box. Figure 6.4 shows an example of what you’ll see.
You’ll need to get around this problem in some way. The two best methods are character substitution at the client end or using an array to transfer the data elements separately. The sample application in this section of the chapter uses character substitution. Transferring a single string is still more efficient than using an array. In addition, the character substitution process isn’t difficult to implement. However, it’s a detail to consider as you create your application. Trying to figure out problems like this can turn into brain-racking sessions unless you’re already aware of the behavior of the components that transfer data when using SOAP.
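The line-break behavior is easy to reproduce with any conforming XML parser. The Python sketch below is offered as an illustration only; the Result element name and the sample text are assumptions. XML parsers normalize each carriage return and linefeed pair to a single linefeed (XML 1.0, section 2.11), and the client can convert the text back before displaying it.

```python
import xml.etree.ElementTree as ET

# Text as the server builds it, with carriage return + linefeed line breaks.
sent = ("Alerter (Alerter) Service is Stopped\r\n"
        "Messenger (Messenger) Service is Running\r\n")
document = "<Result>" + sent + "</Result>"

# Parsing normalizes each CR+LF pair to a single LF.
received = ET.fromstring(document).text


def restore_line_breaks(text: str) -> str:
    """Convert any mix of CR+LF and bare LF line endings into CR+LF."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


restored = restore_line_breaks(received)
print(repr(received))
print(repr(restored))
```

Normalizing to bare linefeeds first keeps the function safe to call twice, because it never produces a doubled carriage return.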
Figure 6.3 Windows 2000 supports a number of services—too many to list in a dialog box.
Figure 6.4 The XML parser always strips carriage returns from outgoing data.
Don’t get the idea that you’ll always use a single string. Using an array to transfer formatted data has advantages as well. For example, the sample application could display the data using a grid similar to the one used in the Services MMC snap-in. I chose to create a simple status display for the example, but you might want to provide more details, and using a grid is definitely beneficial. In short, strings are better for performance, while arrays provide more flexibility in formatting the data at the client end.
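If you start with the single-string approach and later want a grid, the client can still split the string into rows. The following Python sketch assumes each line has the form produced by the sample component, a display name, the service name in parentheses, and the status text; the pattern and the to_rows() helper are illustrative assumptions.

```python
import re

# One line per service: "Display Name (ServiceName) Service is Running"
LINE_PATTERN = re.compile(
    r"^(?P<display>.*) \((?P<name>[^()]*)\) (?P<status>.+)$")


def to_rows(status_text: str):
    """Split the status string into (display, name, status) tuples."""
    rows = []
    for line in status_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:   # skip blank or malformed lines
            rows.append((match["display"], match["name"], match["status"]))
    return rows


sample = ("Print Spooler (Spooler) Service is Running\n"
          "COM+ Event System (EventSystem) Service is Stopped\n")
rows = to_rows(sample)
for row in rows:
    print(row)
```

Parsing on the client keeps the transfer to one string, but it ties the client to the exact text format, which is the trade-off an array avoids.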
The utilities you create must be efficient. After all, querying the server status isn’t the central purpose of a server. Every resource your utility consumes is one that could service a user need. Normally it’s a good idea to limit the scope of information your utility will provide. For example, you’ll notice that the sample application queries only the active database. We could have limited the scope of the component in other ways to reduce the amount of resources required. For example, we could have retrieved a list of only those services that were running.
Given the limitations of SOAP, you’ll want to avoid some performance-enhancing techniques. For example, it might be tempting to provide a partial list of data in the hope that the partial list will satisfy the network administrator’s requirements. This technique works when a live connection exists between the client and server. It doesn’t work well with one-way connections because the client application must create a new session if the network administrator requires more information. When working with a protocol like SOAP, you need to provide a complete answer so the component answers the request in one trip.
Working with the ISAPI Listener
The process of working with the ISAPI Listener is about the same as working with the ASP file. In fact, the coding change required is so small that you might not even notice. When you want to work with an ASP file, you initialize the SOAP client like this:
Note that this code appears in the WSDL file, not in your program code. The program code will remain the same no matter which access method you use. Here’s the ISAPI version of the same component. Notice that only a single line of code changes between the two implementations.
The example program contains buttons for both versions of the call. You won’t notice a difference with this example on a test network. However, using the ISAPI form of the call can save considerable time and resources. Microsoft suggests that you normally use the ISAPI method of creating SOAP calls. Given that you don’t need to make changes to your client code and that you can generate the required WSDL files using the WSDL Generator utility supplied with the Microsoft SOAP Toolkit, experimenting with both forms is easy.
You might run into an odd problem when working with an ISAPI listener that you won’t run into when using an ASP listener for the same application. The Internet Information Server (IIS) Administrator for some versions of Windows has a bug that won’t allow you to enter paths with spaces in them. This prevents you from working with the SOAPISAP.DLL and might prevent your application from working. The first way to fix this problem is to move the SOAPISAP.DLL to a new location without spaces in the pathname. You can also use the Microsoft recommended technique of using the short default pathname of C:\PROGRA~1\COMMON~1\MSSOAP\BINARIES\SOAPISAP.DLL. In addition, the Microsoft SOAP Toolkit has an error as of this writing. It says the SOAPISAP file is located in the Binaries directory of the toolkit. The file is actually located in the \Program Files\Common Files\MSSoap\Binaries directory. Make sure you keep this in mind when looking for files to update.
The ASP method does allow more server-side processing before making the SOAP method call. This particular feature makes the ASP method more flexible than using ISAPI. It’s also the reason you’ll use the ASP method instead of the ISAPI method more often than not. SOAP is still in its infancy, which means that server-side processing for the sake of interoperability isn’t out of the realm of possibility. After SOAP becomes an established standard, however, using the ISAPI method will become prevalent, at least for simple exchanges that don’t involve complex data or other advanced processing requirements.
Some developers are concerned about the potential security risks of exposing the WSML file to the user when using an ISAPI listener. The WSML file must appear in the same directory as the listener, which means you can’t place it in a secure directory in a separate location. The user could potentially read this file and use its contents to compromise the application in some way, depending on the application component. IIS must load the WSML file to enable access to the component. However, it loads the component on behalf of the user, not for the user. This means you could remove read access from the WSML file, which prevents the user from retrieving the file over the Internet. The application will still work because IIS loads the component on behalf of the user using the server’s access rights. This little change in security could keep details of your application from prying eyes.
One piece of magic isn’t immediately apparent when working with the ISAPI method. How does IIS know what to do based on the contents of the WSDL file alone? The fact is that the WSDL file doesn’t provide enough information to make the application work. Open the Home Directory tab of the Default Web Site Properties dialog box. Click Configuration and you’ll see an Application Configuration dialog box like the one shown in Figure 6.5. Notice that the highlighted entry is for WSDL files and that it points to the SOAPISAP.DLL we’ve been talking about throughout this section. IIS creates an application mapping that automatically calls upon the SOAPISAP.DLL to process WSDL files as needed. In fact, this entry is one of the first places you should look if you’re having trouble getting the ISAPI method to work.
Figure 6.5 The ISAPI method relies on application mappings to do its work.
Figure 6.5 has another notable feature. Notice the Cache ISAPI Applications check box. IIS checks this option by default because caching ISAPI applications improves system performance. Unfortunately, caching the applications also causes problems when you’re trying to work with DLLs that IIS accesses. Because the DLLs are in memory, you can’t do anything with them after a call. Clearing this option allows you to debug your applications faster. Make sure you check this option again before you begin performance testing. Otherwise, the performance statistics you gather won’t reflect the realities of working with the application.
Creating the Client
The client application for this example is similar to the clients we’ve created in the past. The two unique features of this client are that you can use multiple methods to call GetStatus() and the client includes a little extra code to add the line breaks the XML parser removes back in. Listing 6.2 shows one of the three methods for obtaining and displaying the server status information.
Listing 6.2 cmdRemote_Click() Source Code
Private Sub cmdRemote_Click()
    Dim Client As SoapClient    'Create the SOAP client.
    Dim Outstring As String     'String used to hold the output data.

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://winserver/soapexamples/ServiceStatus/ServerState.WSDL", _
        "ServerState", _
        "ServiceCheckSoapPort"

    'Get the Status.
    Outstring = Client.GetStatus

    'The XML parser strips all carriage returns, so we
    'need to replace them in the string for proper output.
    Outstring = Replace(Outstring, vbLf, vbCrLf)

    'Display the data.
    txtOutput.Text = Outstring
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    MsgBox Client.faultstring, vbExclamation
End Sub
As you can see, this method relies on the same basic techniques we’ve used in the past. Notice the use of the Replace() function to add the carriage returns back into the string. Of course, you also have to remove the existing linefeed. Figure 6.6 shows the final utility output.
Figure 6.6 The Services Status Check Utility shows the current state of the services on any Windows NT/2000/XP server.
Creating a Simple Employee Check-In Application
As corporations create more online connections for employees to use, they also want to account for employee time better. It’s all part of new resource management techniques designed to make companies more competitive. Of course, e-mail provides one method of remote monitoring, but there are some cases where an employee has no correspondence. One of the purposes of this application is to provide a simple means for an employee to log in, which provides some indicator that he or she is still alive and working on company projects.
Using Schemas as Design Tools
Many people view WSDL and other schema definition file types as a one-way conversion. In other words, you create a component and then use a tool to create a WSDL file based on that component. However, you can also use XML schemas as design tools, which means you’d design the schema first and then create a component that matches the schema.
Don’t confuse this idea with the need to design interfaces for your application first. During the design phase, best practice dictates that you define all of the required interfaces between components so there’s no confusion as to what a component should do. The use of schema-based programming takes the idea a step further by saying the programmer will create the schema code (the WSDL file) before creating the component it will support.
This system would actually work well on a large-scale development. The designer could create the various schemas and then ask subordinates to create the component that would match the schema and provide the required output given a specific set of inputs. The use of the schema would give the subordinate flexibility in creating the component, yet would assure that all of the components within the application would work together based on their various schemas. The existence of a schema file would also allow for easier component validation.
Obviously, you’d need a WSDL design tool to actually make this work. The WSDL tool would have to present the XML it contains in some graphical format, just as other application design tools do today. Some tools approximate that behavior now, but none of them provides a full graphical implementation that would allow a designer to create an entire application. However, such tools are quite likely to appear on the market as SOAP and its associated technologies mature.
Now you need to ask yourself a question. Which should come first, the schema or the component? The choice depends on how you want to develop the application and your personal programming preferences. For some people, the schema will come first because it allows a developer to see the big picture before delving into application coding.
Unlike many of the applications we’ve looked at so far, this is a classic one-way information example. The employee isn’t looking for feedback from the company. All the employee needs to do is check in. This means a one-way communication from the client to the server is all that we need to accomplish the task.
It might seem at first that SOAP would be perfect at performing this task because it’s a one-way communication protocol. However, as we’ll see in the sections that follow, one-way communication remains elusive because of the assumptions made by toolkit vendors. We’ll also talk about some ways to fix the communication problems in a way that will make your application run more efficiently.
Creating the Component
Active Directory opens many doors for the developer. As Windows 2000 becomes more prevalent, developers will use Active Directory as a corporate resource database. You could use this database to store anything. Theoretically, the database is completely extensible, which allows you to add entries as needed. The hierarchical format of the database allows a lot of flexibility in the way you define schemas for your data.
This example looks at Active Directory as a means for keeping track of employees on the road. The section isn’t a full primer on Active Directory (that would require another book), but I do include enough information to understand the example. Obviously, you could track the employees in other ways, but this technique has the advantage of allowing employees to check in as time and connections permit.
The Active Directory Services Interface (ADSI) allows you to work with Active Directory using a standard set of interfaces. We’re using one of those interfaces in this example. Listing 6.3 shows one of two methods used to create an Active Directory user object. This object allows access to the Notes field of the Telephones tab (see Figure 6.7) of the User Properties dialog box found in the Active Directory Users and Computers MMC snap-in.
Figure 6.7 The QCOne component writes entries to the Notes field on the Telephones tab of the User Properties dialog box.
Listing 6.3 Quick Check-In One Component Source Code
Option Explicit

'Define some AD constants.
Const ADS_PROPERTY_CLEAR = 1
Const ADS_PROPERTY_UPDATE = 2
Const ADS_PROPERTY_APPEND = 3
Const ADS_PROPERTY_DELETE = 4
Const E_ADS_PROPERTY_NOT_FOUND = &H8000500D

Public Sub CreateLogEntry(ByVal strUserName As String)
    'Create the variables required for the log entry.
    Dim oUser As IADsUser
    Dim strLDAP As String
    Dim strLog As String

    On Error GoTo ErrorHandler

    'Build a connection string.
    strLDAP = "LDAP://winserver/CN=" + _
              strUserName + _
              ",CN=Users,DC=DataCon,DC=com"

    'Get the user object from Active Directory.
    Set oUser = GetObject(strLDAP)
    oUser.GetInfo

    'Create a new log entry string.
    strLog = oUser.Get("info") + _
             vbCrLf + "User Logged In: " + _
             Date$ + " " + Time$

    'Enter the string in the Notes field.
    oUser.Put "info", strLog
    oUser.SetInfo

    'Release the user object.
    Set oUser = Nothing
    Exit Sub

ErrorHandler:
    If Err.Number = E_ADS_PROPERTY_NOT_FOUND Then
        'This is the first time the user has logged
        'in so we need to create a new entry.
        strLog = "User Login Times" + _
                 vbCrLf + "User Logged In: " + _
                 Date$ + " " + Time$

        'Enter the string in the Notes field.
        oUser.Put "info", strLog
        oUser.SetInfo

        'Release the user object.
        Set oUser = Nothing
    Else
        'Make a log entry if an error occurs.
        App.LogEvent "Error Number: " + CStr(Hex(Err.Number)) + vbCrLf + _
                     "Error Description: " + Err.Description + vbCrLf + _
                     "Error Source: " + Err.Source, _
                     vbLogEventTypeError
    End If
End Sub

Public Sub ClearLogEntries(ByVal strUserName As String)
    'Create the variables required for the log
    'deletion.
    Dim oUser As IADsUser
    Dim strLDAP As String

    On Error GoTo ErrorHandler

    'Build a connection string.
    strLDAP = "LDAP://winserver/CN=" + _
              strUserName + _
              ",CN=Users,DC=DataCon,DC=com"

    'Get the user object from Active Directory.
    Set oUser = GetObject(strLDAP)
    oUser.GetInfo

    'Delete the string from the Notes field.
    oUser.PutEx ADS_PROPERTY_CLEAR, "info", Hex(Null)
    oUser.SetInfo

    'Log an event so the network administrator knows
    'the entries were cleared.
    App.LogEvent strUserName + " cleared all login events", _
                 vbLogEventTypeInformation

    'Release the user object.
    Set oUser = Nothing
    Exit Sub

ErrorHandler:
    'Make a log entry if an error occurs.
    App.LogEvent "Error Number: " + CStr(Hex(Err.Number)) + vbCrLf + _
                 "Error Description: " + Err.Description + vbCrLf + _
                 "Error Source: " + Err.Source, _
                 vbLogEventTypeError
End Sub
One of the more difficult tasks is building an LDAP string that we can use to access a particular resource, but using the ADSI Edit utility provided with the Windows 2000 Support Tools makes that task much easier. Figure 6.8 shows a hierarchical view of the domain I’m working with for this example. Note that you can find any Active Directory object and view its properties. Right-click a user name within the hierarchical list and select Properties. You’ll see a CN=User Properties dialog box like the one shown in Figure 6.9. Notice the Path entry at the top. It might look like you can’t do anything with this entry, but ADSI Edit allows you to select the entry and copy it by pressing Ctrl+C. You can paste the resulting string into your code and know for sure that the LDAP string is going to be correct.
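When the user name comes from outside the component, building the LDAP string by simple concatenation can fail for names that contain characters with special meaning, such as a comma. The Python sketch below shows one way to assemble the same path while escaping the name. It applies only a simplified subset of the distinguished name escaping rules, and the helper names and default values are assumptions taken from the example domain.

```python
# Characters that must be escaped inside a distinguished name value
# (a simplified subset of the LDAP string representation rules).
SPECIAL = set(',+"\\<>;')


def escape_rdn_value(value: str) -> str:
    """Escape characters that would otherwise break the LDAP path."""
    out = []
    last = len(value) - 1
    for index, char in enumerate(value):
        needs_escape = (
            char in SPECIAL
            or (index == 0 and char in "# ")     # leading '#' or space
            or (index == last and char == " ")   # trailing space
        )
        out.append("\\" + char if needs_escape else char)
    return "".join(out)


def build_ldap_path(server: str, user_name: str,
                    container: str = "Users",
                    domain: str = "DataCon.com") -> str:
    """Assemble a path in the same shape as the one used by the component."""
    dc_parts = ",".join("DC=" + part for part in domain.split("."))
    return (f"LDAP://{server}/CN={escape_rdn_value(user_name)},"
            f"CN={container},{dc_parts}")


print(build_ldap_path("winserver", "John"))
print(build_ldap_path("winserver", "Smith, John"))
```

Comparing the generated string against the Path entry that ADSI Edit displays is a quick way to confirm that the pieces are in the right order.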
While we’re in the CN=User Properties dialog box, notice the other fields. You can use this dialog box to explore the various user properties that Active Directory makes available. It’s one way to determine how to modify your applications to make best use of Active Directory.
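If you’d rather not hard-code the whole path, you can assemble it from its parts. The following sketch shows one way to do that; it isn’t part of the component in Listing 6.3. The BuildUserPath() and EscapeDNValue() names are hypothetical, and the list of characters to escape is an assumption based on the LDAP distinguished name rules, so compare the result against the Path entry that ADSI Edit displays before relying on it.

```vb
Option Explicit

'Hypothetical helper: escapes characters that have a special
'meaning inside one component of a distinguished name.
Private Function EscapeDNValue(ByVal strValue As String) As String
    Dim varChar As Variant

    'Escape the backslash first so later escapes aren't doubled.
    strValue = Replace(strValue, "\", "\\")

    'Assumed list of special characters; adjust it as needed.
    For Each varChar In Array(",", "+", """", "<", ">", ";", "/")
        strValue = Replace(strValue, varChar, "\" & varChar)
    Next

    EscapeDNValue = strValue
End Function

'Hypothetical helper: builds a user path in the same form as
'the connection string used in Listing 6.3.
Public Function BuildUserPath(ByVal strServer As String, _
                              ByVal strUserName As String) As String
    BuildUserPath = "LDAP://" & strServer & "/CN=" & _
                    EscapeDNValue(strUserName) & _
                    ",CN=Users,DC=DataCon,DC=com"
End Function
```

For example, BuildUserPath("winserver", "Smith, John") returns LDAP://winserver/CN=Smith\, John,CN=Users,DC=DataCon,DC=com, which you can pass directly to GetObject().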
Figure 6.8 ADSI Edit allows you to locate Active Directory resources quickly.
Figure 6.9 Getting the correct LDAP path is easy if you know where to look.
You’ll instantiate oUser by using the GetObject() method. The GetObject() method requires an object reference in the form of a string, which is where the LDAP string comes into play. However, instantiating oUser isn’t enough to fill it with information we can use. The code also calls on the GetInfo() method to populate the object with data.

Once we have an object with which to work, all we need to do is create a string to put into it. This requires two steps. First, we’ll use the Put() method to add the data to the object’s local property cache. Second, we’ll use SetInfo() to send the data to the server.
Notice the odd-looking error handler for this example. Active Directory treats all empty (Null) values as not existing. So, when we try to use the Get() method on an empty info (Notes) field, Active Directory reports an E_ADS_PROPERTY_NOT_FOUND error. The error handler looks for this return error and creates a special entry for it. In short, this error allows us to create a header for the Notes field.

This example provides more in the way of error handling than previous examples. During testing, I found that the entire process of querying Active Directory is more error prone than many other activities. A small change in the Active Directory structure can wreak havoc with applications. Even a small error in the entry of the user name will result in an error. Unfortunately, all the client will see is a generic error message stating the server doesn’t support the requested action. Because the server likely supported this action yesterday, the user will be understandably confused (and the network administrator might be equally concerned).

Maintaining good log entries is essential when working with Active Directory using SOAP. In this case, the event log entry will contain the error number, the description, and the source of the error. Note that the error number is in hexadecimal to make it easier to match to known errors.

ADSI provides some extended methods that you’ll need to know about as well. The ClearLogEntries() method looks much the same as the CreateLogEntry() method, but it uses the PutEx() method as shown here to clear the entries from the Notes field:

    'Delete the string from the Notes field.
    oUser.PutEx ADS_PROPERTY_CLEAR, "info", Hex(Null)
    oUser.SetInfo
Setting the field to a Null value alone won’t clear it. You need to specify the ADS_PROPERTY_CLEAR constant that the PutEx() method provides to actually clear the field.
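The property-not-found behavior also affects any code that only reads the Notes field. The following sketch shows one way to read the field and treat a missing value as an empty string. The ReadLoginNotes() name is hypothetical and isn’t part of the component in Listing 6.3. The function uses late binding (As Object) so that it doesn’t need a project reference to the Active DS Type Library, and it defines the error constant locally using what I understand to be the standard ADSI value.

```vb
Option Explicit

'ADSI error reported when a property isn't in the cache.
Private Const E_ADS_PROPERTY_NOT_FOUND As Long = &H8000500D

'Hypothetical helper: returns the Notes text for a user, or
'an empty string when the field has never been filled in.
Public Function ReadLoginNotes(ByVal strLDAP As String) As String
    Dim oUser As Object
    Dim strNotes As String
    Dim lngError As Long
    Dim strSource As String
    Dim strDesc As String

    'Bind to the user and fill the local property cache.
    Set oUser = GetObject(strLDAP)
    oUser.GetInfo

    'Trap only the Get() call and save the error details,
    'because On Error GoTo 0 resets the Err object.
    On Error Resume Next
    strNotes = oUser.Get("info")
    lngError = Err.Number
    strSource = Err.Source
    strDesc = Err.Description
    On Error GoTo 0

    Select Case lngError
        Case 0
            ReadLoginNotes = strNotes
        Case E_ADS_PROPERTY_NOT_FOUND
            'An empty Notes field isn't a real failure.
            ReadLoginNotes = ""
        Case Else
            'Pass anything unexpected back to the caller.
            Err.Raise lngError, strSource, strDesc
    End Select
End Function
```

Saving the error values before calling On Error GoTo 0 matters here. Raising an error while On Error Resume Next is still active would simply be skipped, and the caller would never see the problem.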
Some Caveats About WSDL Files

We’ve visited the topic of WSDL files frequently in the book, but it’s important to realize the role they play in efficient SOAP communication for applications that support them. One of the oddities of most WSDL generation utilities is that they always assume two-way communication between the client and server. What happens, however, when you need only one-way communication, as in this example? The component we’ve designed doesn’t provide a result value and the client isn’t expecting a return value.

Figure 6.10 shows the output of the WSDL Generator utility that comes with the Microsoft SOAP Toolkit. I highlighted the request/response pair for the CreateLogEntry() method. The ClearLogEntries() method has a similar pair of entries. If you generate this WSDL file using a product such as 4S4C, you’ll get a similar result. The WSDL file contains the request/response pair even though the return trip is wasted.

One of the few WSDL generators that I found that will create a one-way communication entry is psWSDL Wizard (see Appendix C, “Third-Party Tool Reference,” for details on this product). Figure 6.11 shows a WSDL file generated using the product that allows one-way communication. Creating a WSDL file that permits one-way communication makes your application more efficient. In short, you’ll want to use a third-party product to create the WSDL file for some utility applications, rather than rely on the native capability of the toolkit installed on your system.

Figure 6.10 Most WSDL generators create a request/response pair even when you don’t need it.
Figure 6.11 The psWSDL Wizard helps you create complex WSDL files that improve application efficiency.
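When you compare the output of several generators, it helps to check quickly whether a given operation is described as one-way. The following sketch is one way to perform that check, under several assumptions: the file follows WSDL 1.1 (where a one-way operation has an input element but no output element), the machine has MSXML 3.0 or later installed, and the IsOneWayOperation() name is hypothetical.

```vb
Option Explicit

'Namespace used by WSDL 1.1 documents (an assumption about
'the files your generator creates).
Private Const WSDL_NS As String = "http://schemas.xmlsoap.org/wsdl/"

'Hypothetical helper: returns True when the named operation
'has no output message in the WSDL file.
Public Function IsOneWayOperation(ByVal strWsdlFile As String, _
                                  ByVal strOperation As String) As Boolean
    Dim oDoc As Object
    Dim strBase As String

    'Load the WSDL file synchronously.
    Set oDoc = CreateObject("MSXML2.DOMDocument")
    oDoc.async = False
    If Not oDoc.Load(strWsdlFile) Then
        Err.Raise vbObjectError + 1, "IsOneWayOperation", _
                  oDoc.parseError.reason
    End If

    'Use XPath with the WSDL namespace bound to a prefix.
    oDoc.setProperty "SelectionLanguage", "XPath"
    oDoc.setProperty "SelectionNamespaces", _
                     "xmlns:w='" & WSDL_NS & "'"

    'Make sure the operation exists before testing it.
    strBase = "//w:portType/w:operation[@name='" & _
              strOperation & "']"
    If oDoc.selectNodes(strBase).length = 0 Then
        Err.Raise vbObjectError + 2, "IsOneWayOperation", _
                  "Operation not found: " & strOperation
    End If

    'A one-way operation has no output element.
    IsOneWayOperation = _
        (oDoc.selectNodes(strBase & "/w:output").length = 0)
End Function
```

You might call the function from the Immediate window with the path to a local copy of the generated file and an operation name such as CreateLogEntry. The local path is yours to supply.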
You’ll find a comparison of three WSDL generator outputs for this example in the Chapter 06\Active Directory Component directory of the source code available from the Que Web site. You can find the source code at http://www.quepublishing.com. It’s helpful to look at the output provided by different generators because of coding and efficiency concerns for SOAP applications. Of course, there are interoperability concerns to consider as well. You might find that a WSDL generator that normally provides good interoperability fails in a specific situation. Developers will continue having interoperability problems between platforms until the SOAP specification is complete.
Creating the Client

The client application looks much like all of the other high-level API clients we’ve created to date. Listing 6.4 shows what this client looks like.

Listing 6.4 contains only the code for the cmdMakeEntry_Click() method. You’ll find the full client source code in the \Chapter 06\Remote Log Test directory of source code available from the Que Web site. You can find the source code at http://www.quepublishing.com. The \Chapter 06\Local Log Test directory contains an alternative client you can use to test the component before moving it to your server.
Listing 6.4 Remote Test Client Source Code

Private Sub cmdMakeEntry_Click()
    'Create the SOAP client.
    Dim Client As SoapClient
    Dim Result As String
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://winserver/soapexamples/UserLog/QCOne.WSDL", _
        "QCOne", _
        "UserLogSoapPort"

    'Make the log entry and report success.
    Client.CreateLogEntry txtUserName.Text
    MsgBox "Thanks for checking in!", _
           vbOKOnly Or vbInformation, _
           "Check In Success"

    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + Client.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + Client.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + Client.faultactor + vbCrLf + vbCrLf + _
                   "Detail: " + Client.detail
    MsgBox ErrorMessage, vbExclamation
End Sub
Notice that the client call doesn’t expect a result in this case because the server won’t provide one. Using one-way calls saves time and resources. However, because the server doesn’t provide feedback, we’ll still need to display something for the user. In this case, a simple message suffices.
Testing the Application

Normally, you can look at a utility program and quickly understand what the output is going to be. For example, the server information utility at the beginning of this chapter is relatively easy to figure out. You can guess that it will output a list of services and their current state. The utility in this section is a little different. It would be easy to extend this utility to perform all kinds of tasks in the background. All the user would need to do is click a single button to start the process. This lack of two-way communication means that you have to test such an application thoroughly. In addition, you’ll need to spend more time looking at potential problems.

Let’s look at the output of the application first. After you start the application, click Make Entry twice. Open the User Properties dialog box for the user you selected and you should see a display similar to the one shown in Figure 6.12. Likewise, you’ll want to verify that the Clear Entry button removes the entries shown in Figure 6.12.

Figure 6.12 The component will normally provide user login time entries in the Notes field.
Let’s talk about the Make Entry button a bit more and answer the question of why you clicked the button twice. The server-side component went through two different processes to create these entries. Remember that the first time the application generated an error and handled it by creating the primary login entry. The second call didn’t generate an error because the Notes field had an entry in it.

Next, you’ll want to enter an erroneous value for the user name. You should see two event log entries on the server. The first will state that the Active Server Pages service has stopped. The second will show the error entry from the component. Figure 6.13 shows a typical server-side error event log entry.

Figure 6.13 Server-side event log entries will contain the information needed to troubleshoot the application.
The current version of the Microsoft SOAP Toolkit has an interesting problem. If you use the ASP method of accessing the component, you’ll see the event log entries as anticipated. On the other hand, using the ISAPI access method normally prevents the component from making event log entries. Microsoft is aware of the problem and will probably fix it before you read this book. However, it’s important to check for this issue so that you don’t rely on event log entries that will never appear. The sample component also makes an informational event log entry every time a user clears the log. It’s important to let the network administrator know when users are performing tasks that might not meet with company guidelines. You’d want to use this technique when company policy allows a user to perform a task under certain circumstances, but not under others. The client-side event log entry isn’t as informative as the one on the server, but it does appear every time an error occurs, no matter which access method you use. Figure 6.14 shows a typical example of a client-side event log entry. As you can see, it tells you that an error occurred and nothing more.
Figure 6.14 Client-side event log entries are only useful in providing you with confirmation that an error occurred.
Project

We’ve looked at two types of utility programs in this chapter. The first requests information from the server, while the second sends information to the server. Both utilities are small and perform a single task well. The following sections look at a few projects you might want to try based on the content of this chapter.
Creating Your Own Utility

Many developers find that utility programs are the best way to learn new techniques because they’re small and easy to build. You can also target utilities at learning how to perform specific kinds of tasks. For example, I created an entire series of small utility programs when learning to work with Active Directory. The small programs allowed me to concentrate on one area at a time without getting overwhelmed.

Vendors created SOAP to overcome many obstacles, one of which is the steep learning curve of other technologies. However, even SOAP takes a while to learn. Creating your own utility could be the way to learn how to perform certain types of tasks, such as dealing with user-defined types (an issue we discuss in Chapter 7, “Creating Data Entry Forms and Surveys”). Try creating one or two small utilities of your own that reflect the tasks for which you’ll use SOAP in your business. Here are some steps you can follow to develop these utilities quickly:

1. Determine which type of utility you want to create.
2. Decide whether the utility will actually use the features provided by SOAP. Make sure that SOAP isn’t overkill for the situation.
3. Define the interfaces for your utility. Look for ways to use simple types to transfer data. Also check for potential data translation problems, such as the XML parser’s stripping of linefeeds from the SOAP character stream.
4. Simplify complex problems so the utility you create will perform a single task well.
5. Create the component for your utility. Try various WSDL generators to see which one performs most efficiently. Look for obvious interoperability problems.
6. Code the client-side utility and test. Look at data output, event log entries, and other clues that your application works as anticipated.
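Step 3 mentions linefeeds that disappear from the SOAP character stream. One way to protect them, sketched here, is to swap each line break for a visible placeholder before sending a string and to restore the breaks after receiving it. The EncodeLineBreaks() and DecodeLineBreaks() names and the placeholder text are hypothetical; choose a placeholder that can’t occur in your own data.

```vb
Option Explicit

'Assumed placeholder; it must never appear in real data.
Private Const LINE_MARKER As String = "[[BR]]"

'Hypothetical helper: call this before placing a string in
'a SOAP message.
Public Function EncodeLineBreaks(ByVal strText As String) As String
    'Normalize CR+LF pairs and lone CRs to a single LF.
    strText = Replace(strText, vbCrLf, vbLf)
    strText = Replace(strText, vbCr, vbLf)

    'Swap each remaining line break for the placeholder.
    EncodeLineBreaks = Replace(strText, vbLf, LINE_MARKER)
End Function

'Hypothetical helper: call this on the receiving end to
'restore Windows-style line breaks.
Public Function DecodeLineBreaks(ByVal strText As String) As String
    DecodeLineBreaks = Replace(strText, LINE_MARKER, vbCrLf)
End Function
```

Both the client and the component have to use the same placeholder, so it makes sense to keep these two functions in a module that both projects share.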
Upgrading the Server State and User Check-In Utilities

I kept both utilities in this chapter short to make the SOAP concepts I was trying to demonstrate easier to see. Both utilities do provide useful features and you might find that you want to expand their capability. This section contains some of the ideas that I’m going to try as I have time.

The server state utility is handy if you want to check the state of the services on several different servers. The reason this utility is useful is that it’s more flexible and runs faster than the MMC snap-in. You could add enhancements, such as the ability to enter server names or to output the data to a file in a batch process. Because my test system has only one Windows 2000 server, I hardwired the utility for now. However, I’ll probably add the ability to choose a server later because my system does have other Windows servers and workstations on it.

The current version of the utility shows only one piece of status information. Another way to upgrade the server state utility is to display more status information. You could even output all of this information to a text file and automatically parse it for trouble spots on your system.

The user check-in utility is an idea in the rough. The idea isn’t new, but the technology behind it is. Any network administrator will tell you that automation for the road warriors isn’t just nice, it’s a must if you want them to actually get some work done while on the road. Of course, the best automation is the type that works automatically. The user check-in utility could update the user’s status information in Active Directory, upload hours worked, orders, or anything else your company needs to perform.

One of the issues that SOAP developers are coping with now is asynchronous communication. This application is a perfect example of something that would benefit from asynchronous communication. A user could upload information to a queue on the server every time he checks in. Because the user isn’t expecting a reply, the server can work on the input as time permits.

Hopefully, I’ve given you some ideas on how to upgrade the two utilities in this chapter. Here are some steps to follow when adding to the utilities.

1. Determine if the utilities will work as they are now. If not, decide what you’d like to see added.
2. Create a new project and add the code from the book utilities to it. That way your baseline code won’t disappear under a cloak of changes.
3. Add new features one piece at a time.
4. Perform local testing, then place the utility on the server and test with SOAP.
5. Contact me about your latest creation at [email protected]. I’m always interested in hearing how people have used the code within my books. Knowing this information helps me create better examples in the future.
CHAPTER 7

Creating Data Entry Forms and Surveys

In this chapter

Determining Which Data Entry Vehicle to Use
Shortcuts for Data Entry and Survey Applications
Creating a Simple Survey Form
Creating a Simple Data Entry Form Application
Security, Privacy, Performance, and Reliability Issues
Handling Data Entry and Survey Form Errors
Using Templates for Quick Forms
Using MIME for SOAP Applications
Project
For many readers, the title of this chapter probably evokes visions of the simple online forms that businesses use to solicit customer opinions. Many Web sites today include places for users to interact with a company using surveys or other forms for data entry. Companies collect a vast amount of information using these online sites. In fact, the amount of information collected using data entry forms is so large that many companies can’t imagine conducting business without them. Obviously, data forms and surveys for interacting with customers are an important consideration. Of course, customer input isn’t the only use for data forms and surveys. Much of the information that companies collect is from sources that use a product or service that the company provides or at least supports. For example, today software vendors collect a lot of user information with registration forms transferred over the Internet. SOAP represents a way of making the process of collecting this information easier and less error prone. However, even these two forms of feedback are only the tip of the iceberg. Employees can also make support calls using online forms. Likewise, customers can request status information and partners can assess service potential using forms. SOAP also represents one of the best ways to support this two-way communication. This second form of communication is so important that Microsoft and other vendors are giving it a special name—Web Services. No matter what name you use or how you look at data entry forms and surveys, they still amount to a formalized method for exchanging data. The form enforces a certain level of continuity between requests and simplifies the software required to handle the data. In short, this chapter looks at methods for organized data handling that also interacts with a data storage technology, such as a database. 
This chapter will introduce you to some of the techniques you can use to make data entry forms easier for both the client and the server. These techniques also reduce development time and the burden on the developer. I won’t try to cover techniques that you can find in books on general programming, but will instead focus on techniques for online data handling. Of course, I’ll provide you with links to sources of information that you might want to use to get the general information not included with this chapter. The first section of this chapter, “Determining Which Data Entry Vehicle to Use,” introduces the concept of a data entry vehicle. It describes the forms of data entry and tells you how well each form will work in a given situation. By the end of this section, you’ll decide on a data entry form for your application and know why it’s the best choice. The second section of the chapter, “Shortcuts for Data Entry and Survey Applications,” provides you with some shortcuts you can use to make the development process a little easier. The formal nature of these applications means you can use some of the same techniques to create most applications—the content is different, but the procedure is the same. We’ll create two applications in the third (Creating a Simple Survey Form) and fourth (Creating a Simple Data Entry Form Application) sections of the chapter. The first example shows how to create a survey application. It provides one-way communication between client and server. This example will contain the first Web page processing in the book. The
second example is a data entry application. This example shows you the mechanics of creating a SOAP message with attachments. This is just one of many ways to send complex data, but it might provide the best choice when working with certain types of database management applications. The fifth section, “Security, Privacy, Performance, and Reliability Issues,” addresses the four non-programming issues of critical importance to your data entry form or survey application. Security is a growing concern for all types of data. Likewise, privacy issues have received major press in the past few months. Ensuring you actually receive the data sent by the user in good condition is another concern. Finally, we’ll address the performance hit your server will receive when processing the data. In the sixth section of the chapter, “Handling Data Entry and Survey Form Errors,” we’ll discuss special error handling for form-based data. This error handling is in addition to the other types of error handling we’ve already discussed in the book. The seventh section of the chapter, “Using Templates for Quick Forms,” will talk about using templates to create forms. This is especially important because companies tend to change their forms often. Surveys run for a limited amount of time and vendors who use surveys are always devising new surveys to test consumer reactions. Using templates for form-based data is essential if you want to retain your sanity. The eighth section of the chapter, “Using MIME for SOAP Applications” discusses how MIME will eventually affect SOAP applications. Unfortunately, vendors haven’t implemented MIME support yet, even though it’s part of the specification. This section tells you what vendors plan to do in the future. It also tells how using MIME will change data entry for the better. Finally, the Project section will take you through some steps for creating your own data entry project. 
This section helps you put everything you’ve learned in the chapter into practice by working through a quick project.
Determining Which Data Entry Vehicle to Use

Entering information into the computer is a time-consuming, but necessary task. However, some users end up entering the same data more than once because of the way vendors develop applications today. Even if you can copy and paste data from one application to another, you also have formatting and other issues to consider. In addition, application upgrades often wreak havoc with data entry. For example, there was an instance when Microsoft released a version of Word that was so incompatible with its predecessors that the product required a patch almost upon release.

You can view data entry as two elements: the data and the formatting (or the view of the data). Data entry today is more a view of a particular piece of information than the data. Consider the simple word processor. Sure, the words are important, but how the application
presents those words onscreen is even more important in the minds of users. Word processor vendors provide all kinds of interesting ways to quickly format data and retain the formatting from session to session. Yet, the data entry process is about the same today as it was when I first started using computers many years ago. XML can free applications from worrying about the format of the data. This is such a powerful concept that companies such as Sun Microsystems are currently testing it in the real world. StarOffice is an open source productivity application that includes several common features such as a word processor, spreadsheet, presentation manager, and database. The StarOffice difference is that the data is stored in XML. This means that unlike products such as Microsoft Office, you should be able to move the data between platforms without problem. Sun plans to offer StarOffice on several platforms, including Windows, Solaris, and Linux. Sun eventually hopes to establish the StarOffice file format as a universal format for all productivity applications, which means we might finally free ourselves of inter-application compatibility problems. As part of this effort, Sun has released the draft specification to OpenOffice.org (http://xml.openoffice.org) for study by the open-source community. We’ll see in the “Working with SQLXML” section of Chapter 8, “Providing Remote Database Access,” that Microsoft is also working hard to move some of their data to XML (or at least make it accessible in that form). Of course, this begs the question of what this means to you as a developer. Choosing the correct data entry vehicle is becoming less important than the means used to store and transfer the data. Users are demanding more from their applications, and management wants to be sure that the data users enter today will remain viable tomorrow. Therefore, the question of which data entry vehicle to use is one of data storage and user ease of use. 
Consider, for a moment, the new kinds of applications that XML is allowing developers to create. Tim Bray, the co-inventor of XML, has launched a public Web site (http://map.net) that shows how you can visually plot information on the Internet. This launch is part of his new company, Antarti.ca. Tim calls the application Visual NET. It allows users to navigate the Internet visually, rather than use the old system of URLs. It relies on a special database setup that stores data in a common format yet allows the application to determine the means of presentation. This is one of the first promising Application Service Provider applications to appear on the market that breaks the old molds. Ultimately, protocols such as SOAP will allow even broader application sharing—users might not know precisely where their application executes in the future, and they won’t care because the data is there for them to use.
Vendors are creating new data storage and transfer standards based on XML on a daily basis. If you don’t see the technology that you need today, it will probably be along with the next vendor press release. The point is that while the world is embracing XML and by extension SOAP in a big way, the proliferation of standards means we’ll eventually have too many to support. Look for the usual technology shakeout in a year or so— the remaining standards are those that are truly useful.
This chapter looks at two of many data entry vehicles: the data entry form and the survey. Data entry forms typically provide a formatted means to enter data in a two-way relationship with the server, while surveys often provide more freedom in a one-way relationship. We’ll look at other data entry vehicles as the book progresses. It’s important to realize that all of these applications ultimately rely on XML as a data transfer methodology, but could rely on other techniques for data storage. It’s often hard to point out the exact moment that a transition occurs because historically, the people involved in the transition don’t understand what’s taking place. XML could become the next big transition for data storage and SOAP might become the most popular way of moving XML from one place to another. It’s clear, however, that we’re in some sort of transition brought on by the connectivity that the Internet provides. As a developer, you need to consider your data entry vehicle choices carefully. Not only do you need to think about the user’s needs, but you also need to consider how the data is ultimately stored on the server.
Shortcuts for Data Entry and Survey Applications

Database applications, no matter how well designed, present challenges that you won’t find in any other application type. The fact that a database application is responsible for moving mission-critical data means there’s added complexity to consider. A database application has to manage that data for many users at the same time. In short, database applications are the most difficult to create and likely the type you’ll run into most often.

The following sections won’t solve all of your database application problems. I could write entire books on just one database management system (DBMS), much less try to cover them all in one section of a chapter. These tips will help you with some of the problems you’ll experience working with databases under SOAP. We’ll discuss topics such as data conversion and methods to reduce the number of round trips you have to make.
Reducing the Number of Round Trips

You’ve likely seen at least one database application where the developer uses several round trips to accomplish any given task. For example, a session might begin with the application requesting some bit of information to perform a lookup. After it performs the lookup, the application will send part of the data back to the client and ask for still more input. The client provides the additional input and the application sends back another part of the data picture. This seesaw action continues until the query is complete.

Although the multiple-request approach might work on a LAN or WAN, it’s not going to work on the Internet. For one thing, SOAP is a one-way protocol that some vendors have extended into a request/response format. Another problem is bandwidth. Making more than one request is going to slow the application to the point that no one will want to use it.
Of course, this point actually comes down to a problem with design in most cases. You need to design the application so that it performs well with a single request and a single response. This means gathering all of the data needed to answer an entire query at one time. It might also mean separating detail pages so that you get the main data on one page and the details on another. Theoretically, this will allow some developers to get around the multiple request problem without many application design changes.
Choosing Document or RPC Style WSDL Files

WSDL actually supports two formatting styles: document and remote procedure call (RPC). The RPC style is the most common by far. In fact, it’s the only choice you’ll have when using many WSDL generators. One of the few products that does give you a choice is the psWSDL Wizard (see Appendix C, “Third-Party Tool Reference,” for details). Figure 7.1 shows you the dialog box that allows you to choose between document and RPC style.

Figure 7.1 Only a few WSDL generators allow you to choose between document and RPC-style output.
The RPC style is most popular because it allows the greatest flexibility in sending data between client and server. You can’t use the document style to send abstract data types. This makes the document style inappropriate for some database applications. Figure 7.2 shows the RPC output for the ServerState.DLL example from Chapter 6, “Creating Remote Access Utilities.” (You can also compare the two styles by looking at the WSDL files in the \Chapter 07\Doc vs RPC directory of the source code available from the Que Web site. You can find the source code at http://www.quepublishing.com.) Now, compare Figure 7.2 to the document-style output in Figure 7.3. As you can see, the document style is less complex and consumes less space. There are fewer opportunities for interoperability problems when using the document style. In addition, the document style is more efficient. It allows you to transfer data using fewer resources.
Figure 7.2 The RPC style output for the ServerState.DLL allows complex data definitions.
Figure 7.3 The document style output for the ServerState.DLL provides efficient data transfer.
Don’t get the idea that the document style type is only for small amounts of data. Many developers prefer to use the document style message for large amounts of simple data. For example, a purchase order might contain only strings or a combination of strings and numbers. Transferring a large purchase order using the document style will consume fewer system resources and enhance performance.

The choice of document versus RPC style is normally determined for you by the type of data you want to transfer. Always use the RPC style when transferring abstract data or calls that contain UDTs. However, in those situations when you do need to make a choice, using the document style might be the best choice because of the performance boost it provides.

The RPC and document styles differ in another way that will affect your decision. Most developers view the RPC style as easily adapted to scripting. You would use it in situations when you don’t know anything about the client in advance. The document style requires advance knowledge of the document format. In short, many developers view the document style as useful between business associates, but not between a business and multiple customers. The SOAP specification doesn’t necessarily define this viewpoint; developers base this opinion on experience with SOAP.

A final consideration is that at least some developers view the document style as easier to document. The document style shows workflow more clearly, making the intended use easier to divine from the message alone. However, this is another situation where opinion and personal experience might figure more highly than fact.
Empty and NULL Value Processing
One feature of database management programs is that not every field of every record will contain data. Some records will contain empty or NULL values. For example, it’s common to provide a second address line for client contact systems. Some clients will have a second address line, some won’t; therefore, this field will contain an empty string in some cases. Some developers will attempt to short-circuit message output for NULL and empty values by providing a single tag output for the value. Unfortunately, the following output won’t work in most cases:
<Address2/>
Most XML parsers will simply ignore the Address2 value, resulting in odd application behavior. For example, some developers reported value shifting when they attempted to use this technique. The XML parser would fill the missing value with the next value in line. You can bypass this problem for empty values by providing an empty tag set like this:
<Address2></Address2>
Of course, you might need a method for differentiating between empty and NULL values. Unfortunately, the XML and SOAP specifications don’t provide a way to differentiate between the two. However, many developers now use the word “NULL” in all uppercase, as shown here, to differentiate between NULL and empty values. ”NULL”
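The empty-versus-NULL convention described here can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the helper names encode_field and decode_field are hypothetical, the field name Address2 comes from the address example, and the uppercase NULL marker is the informal convention mentioned above rather than anything the XML or SOAP specifications define.

```python
import xml.etree.ElementTree as ET

# Informal sentinel convention described in the text; not part of XML or SOAP.
NULL_MARKER = "NULL"


def encode_field(name, value):
    """Return the XML text for one database field (hypothetical helper).

    None (a database NULL) is written as the NULL marker. An empty string is
    written as an explicit open/close tag pair instead of a single tag.
    """
    elem = ET.Element(name)
    elem.text = NULL_MARKER if value is None else value
    # short_empty_elements=False forces <Address2></Address2>.
    return ET.tostring(elem, encoding="unicode", short_empty_elements=False)


def decode_field(xml_text):
    """Reverse encode_field: the marker becomes None, missing text becomes ''."""
    text = ET.fromstring(xml_text).text
    if text == NULL_MARKER:
        return None
    return text or ""
```

One weakness of this sketch is worth noting: a field whose real value is the four letters NULL cannot be told apart from a database NULL, which is one more reason to test the convention on every platform you plan to support.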
Note that this entire issue is in a state of flux as I write this and a standardized solution might be available when you read this section. The method used to define empty and NULL values is important, and the lack of consistency is problematic. Some developers have reported getting different results for empty and NULL values by changing the product version. The point is that if you plan to transfer empty or NULL values, you should perform testing on every platform you plan to support. Otherwise, you might end up with many unexpected support calls that vary according to the product version in use.
Unregistering Your Control
Database management applications seem to require more than the usual number of controls. Registering and unregistering these controls as you test them can become a painful experience because you need a DOS command prompt every time. I always keep a command prompt handy for just this purpose.
You can make the process of creating a command prompt where you need it easier by adding a new entry to an existing file type in Windows. Locate the Folder File Type on the File Types tab of the Folder Options dialog box in Windows Explorer. Click Advanced. You’ll see an Edit File Type dialog box. Click New. You’ll see a New Action dialog box. Type Command Prompt Here in the Action field and cmd.exe /k "cd %1" in the Application Used to Perform Action field. Click OK three times to complete the action. Now, every time you need a command prompt, right-click the folder and choose Command Prompt Here from the context menu.
Some people find that they don’t like using the command prompt to register and unregister their components using RegSvr32. You’ll find an alternative method in the Xteq COM Register Extension at http://www.xteq.com/downloads/index.html#comr. This utility adds two new menu entries to your system as shown in Figure 7.4. All you need to do is right-click the component and choose Unregister library to unregister it. Likewise, choosing Register library will add it to the registry. You get the same RegSvr32 messages as normal. The only difference is that you don’t have to go to the command line to perform the task.
Figure 7.4 The COM Register Extension adds new menu entries to your system.
Understanding CDATA Sections
XML supports the concept of a CDATA (or character data) section within messages. You can use a CDATA section to prevent interpretation of part of the message by the XML parser. For example, this is one way to get around the problems with transferring carriage return/linefeed pairs in a message. This sounds like a promising fix for many problems that developers experience when using SOAP. Using a CDATA section is easy. All you need to do is enclose the character data as shown here:
<![CDATA[Text that the parser should not interpret]]>
Unfortunately, CDATA sections aren’t the cure-all that developers think they are. Many developers experience problems using CDATA sections because they don’t understand the restrictions of using them. For example, you can’t place the ]]> character sequence anywhere in the section for an obvious reason: the XML parser will think the CDATA section has ended and you’ll get unpredictable results. This means that you can’t use a CDATA section to transfer an XML document because you can’t be sure the XML document won’t include the ]]> character sequence. CDATA sections are limited to clean character data, that is, data that is primarily text without control information. Although you can transmit functions using CDATA sections, you must ensure that the function won’t contain data that might cause problems within the CDATA section.
CDATA sections are a valuable asset for the developer. However, you’ll want to ensure that you understand the limitations of using them. Attempting to use CDATA sections when you need to use Base64 encoding or attachments will definitely cause crashes and unpredictable application behavior.
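A guard against the ]]> restriction is easy to sketch. The Python fragment below is a minimal illustration, not part of any SOAP toolkit: the helper names wrap_in_cdata and read_comment and the Comments element are assumptions made for the example. It refuses text that contains the terminator and shows that a standard parser returns the protected text unchanged.

```python
import xml.etree.ElementTree as ET

CDATA_END = "]]>"


def wrap_in_cdata(text):
    """Wrap text in one CDATA section, refusing text that contains ]]>."""
    if CDATA_END in text:
        raise ValueError("text contains ']]>' and cannot go in a CDATA section")
    return "<![CDATA[" + text + "]]>"


def read_comment(message):
    """Parse a single element and return its character data."""
    return ET.fromstring(message).text


# Markup characters that would otherwise need escaping pass through untouched.
snippet = "if (a < b && c > d) { run(); }"
message = "<Comments>" + wrap_in_cdata(snippet) + "</Comments>"
```

Calling read_comment(message) hands back the original snippet, while wrap_in_cdata raises ValueError for any input that includes the terminator, so the failure shows up on the sending side instead of as unpredictable parser behavior on the receiving side.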
Understanding XML Document Transmission Restrictions
Someone has said that the odd thing about SOAP is that it’s XML technology to the fullest, but it can’t easily transmit an XML document. You read in the previous section that you can’t use CDATA sections to transmit XML documents because you can’t be sure the document will be free of unwanted character sequences. Not only do you have to consider character sequences in XML documents, but you also have to consider their encoding. Your SOAP application might rely on UTF-8, for example, whereas the XML document you want to transfer relies on UTF-16. The difference in encoding will cause your application to crash or at least act unpredictably.
One way to overcome this problem is to transfer the XML document as an attachment. Unfortunately, none of the presently available SOAP toolkits supports attachments. If you want to use attachments, you have to build your own functionality from scratch, which is a daunting task for even the best programmer.
Another way to overcome the problem of transferring XML documents is to use Base64 encoding. This is the current method used to transfer binary data. It works well, but exacts a performance penalty. Base64-encoded data requires more space; therefore, it requires more time and resources to transfer. Using Base64 encoding on large documents might prove prohibitive from a performance perspective. Some developers also worry about compatibility problems when using Base64, although no one has reported major problems in this area.
Some developers get around the problem by using escape sequences in place of problem markup characters. The document object model (DOM) parser at one end converts problematic character sequences to escape sequences, whereas the DOM parser on the other end converts the escape sequences to markup language character sequences. This method still exacts a performance penalty, but not as great as using Base64 encoding. However, using this method restricts you to using the DOM parser. Many developers prefer using Simple API for XML (SAX) parsers because they provide a performance boost.
In addition, using escape sequences means you must examine the code carefully to ensure it catches all of the markup language characters. Some XML parsers will automatically handle escape character sequences for you when used in DOM mode. This saves the developer time coding what should be a standard feature. However, you also need to verify that the XML parser provides consistent escape character sequence handling. For example, some developers report that the Microsoft XML parser (MSXML) version 3.0 handles the vertical tab character (ASCII decimal 11) unpredictably. In some cases, it escapes the character; in other cases, it doesn’t. Fortunately, the parser will register an error to tell you about the problem character. The reason that I mention the vertical tab is that it’s an illegal character according to the specification (http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char), so MSXML should always fail or at least strip the character out of the data stream. The point is that you want to look for an XML parser that provides predictable behavior if you plan to use it for adding escape character sequences automatically.
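One way to avoid depending on a parser's handling of illegal characters is to check the data before it goes into a message. The following Python sketch follows the Char production of the XML 1.0 specification cited above; the function names are hypothetical and the check is an illustration, not a feature of MSXML or any SOAP toolkit.

```python
def is_legal_xml_char(ch):
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, linefeed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_chars(text):
    """Return (index, code point) pairs for every character XML 1.0 forbids."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if not is_legal_xml_char(ch)]
```

A string containing a vertical tab (ASCII decimal 11) yields one entry giving its position, so the application can strip or reject the character before any parser sees it.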
The bottom line is that no perfect solution to this problem exists. You need to choose the solution that causes the fewest problems in your programming environment, and then find ways around the remaining problems. For example, you could performance tune your application to reduce the effects of Base64 encoding performance penalties.
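The trade-off between Base64 encoding and escape sequences can be seen with a short experiment. This Python sketch is only an illustration: the sample order document and the four helper names are invented for the example, and the standard library calls stand in for whatever your SOAP toolkit and parser provide.

```python
import base64
from xml.sax.saxutils import escape, unescape

# Hypothetical inner document that declares a different encoding than the
# surrounding SOAP message would use.
inner_doc = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    '<order><item qty="2">A&amp;B</item></order>'
)


def embed_as_base64(document, encoding="utf-16"):
    """Encode the document bytes as Base64 text that is safe inside any element."""
    return base64.b64encode(document.encode(encoding)).decode("ascii")


def extract_from_base64(payload, encoding="utf-16"):
    """Recover the original document text from the Base64 payload."""
    return base64.b64decode(payload).decode(encoding)


def embed_as_escaped(document):
    """Replace &, <, and > with escape sequences."""
    return escape(document)


def extract_from_escaped(payload):
    """Turn the escape sequences back into markup characters."""
    return unescape(payload)
```

Comparing len(embed_as_base64(inner_doc)) with len(embed_as_escaped(inner_doc)) on your own documents gives a rough feel for the size penalty: Base64 always emits four output characters for every three input bytes, while escaping grows only where markup characters occur.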
Using a Third-Party Product to Document Your Components
Of all the applications we’ll create in the book, it’s most likely that someone will need access to your components when working with survey or data entry applications. A partner might require access to these components to incorporate them into an application that accesses resources on your site. For example, ordering parts might consist of creating a data entry form that sends the information directly to your company.
Many products on the market help you build documentation for your components. However, most are difficult to use and many are quite expensive. DocBuilder is a freeware product created by GFaI. The terminology has a definite German slant, but you can easily modify it for use with English by changing the configuration files. GFaI supplies well-written English language documentation, and I found that modifying the scripts was relatively easy using the tips in the frequently asked questions section of the help file.
We’ve discussed the importance of documenting your components in situations where a third party might need to use them. DocBuilder (http://www.gfai.de/produkte/docbuilder/e_index.htm) relies on scripts to search your application and create documentation automatically. VBdocman (http://vbdocman.exit.mytoday.de/), a shareware documentation product, creates help files for your Visual Basic applications in HTML, RTF, and Windows Help formats. You can also create user-defined output formats, such as XML, with this product. As with DocBuilder, you don’t have to add comments manually to use this product, but it helps. The quality of the documentation you receive is directly proportional to the number of tags added to the documentation. Unlike some products on the market, VBdocman will add the tags to the document for you, but you still have to add descriptive text. The generated file can include see also links, example links, objects, methods, property or event descriptions, parameter descriptions, setting descriptions, return values, and remarks. You’ll find an example of the output for VBdocman in the \Chapter 07\VBdocman directory of the source code available from the Que Web site. You can find the source code at http://www.quepublishing.com.
The product in this section works only with C++, Pascal, and Delphi at the time of this writing. However, the author, Dr. Josef Richardt, intends to add Visual Basic support in the future. You can also add the specially formatted comments to a Visual Basic file and still use DocBuilder with it; the product won’t work with an uncommented Visual Basic file. Consequently, the example will show a Visual C++ example and not a Visual Basic example. Testing shows that DocBuilder works equally well with Borland C++ and Delphi.
This section shows how to use DocBuilder with the ServInfo component in Appendix D, “SOAP for Visual C++ Developers.” Using this product is easy. You’ll want to move one of the script files to the project directory from the \Program Files\DocBuilder\examples\dbs directory. Create a new project using the File | New Project command. Define the project and output directories, along with the script file that you copied to the project directory. Figure 7.5 shows a typical New Project dialog box.
Figure 7.5 Typical DocBuilder New Project dialog box containing directory and script information.
Notice the Define Input Files button at the bottom of the dialog box in Figure 7.5. Click this button to add files to your project. You can add the files later, but it’s more convenient to add them now. In this case, I’ll add the ViewServer files from the ServInfo component because these files contain my class descriptions. Figure 7.6 shows how the Input Files dialog box looks after you add files to it. As you can see, DocBuilder allows you to create complex setups consisting of multiple components. Click OK to close this dialog box, and click OK again to close the New Project dialog box.
Figure 7.6 Use the Input Files dialog box to add class files to your project.
At this point, you’ll see the main project window. Most of the menus act as you’d expect. You can use the Definitions menu to redefine elements of your project such as the script used to create output. You’ll also use this menu to title your document and to modify help file attributes. (I found the defaults worked fine for the components that I tested.) After you have the project defined, use the options on the Analysis menu to check your setup. You must perform an analysis before you can create output files. The analysis process will show any errors in your project files that prevent DocBuilder from creating documentation for your component.
The final step is creating output. You have a choice of HTML, Windows Help, or rich text format (RTF) files. I used the Windows Help file format for the example. Figure 7.7 shows a typical Windows Help table of contents for this product when working with a simple component. Notice that you can obtain a full description of almost every element of the component. Of course, this is a simple component, so there was little information to garner, but tests on complex components show that the product scales well.
Figure 7.7 Windows Help contents output from DocBuilder.
Figure 7.8 shows the type of documentation you can expect using an unmodified code file. Adding comments to your source code will improve the documentation produced. Notice the German language title bar entry. You can change these strings to English or any other language by modifying the output files. For example, you’d modify the help project (HPJ) file for a Windows Help project.
Creating a Simple Survey Form
Companies use survey forms for a variety of purposes. The most common use is to ask customers how they feel about a product and how they use it. In some cases, a company might ask for customer referrals or provide an opportunity for the customer to ask for additional information about another product. Survey forms also figure prominently in employee relations. The suggestion form is nothing more than a survey in disguise. From a development perspective, many forms follow the survey format.
A good way to look at surveys is as data entry forms that require a one-way communication. The form requests data in the form of an opinion and the sender knows there won’t be any immediate response (except a simple thank you message at the end). Most survey forms include two question types. The first requests a definite response, such as rating the usability of a product for a specific purpose. The second is an essay answer where the respondent types an answer. In short, you can reduce survey data to integers and strings, making it easy to transfer the data using any method.
Figure 7.8 A sample of an individual document page from DocBuilder.
Some developers will work with more than one company, each with its own set of components. This leads to using more than one WSDL file. A simple fact to remember is that a SOAP client can handle only one WSDL file. You can combine multiple SOAP clients in one application, but you must maintain separate clients for each WSDL file.
Surveys don’t provide much value if you don’t analyze the data they provide. Some companies place the information in a spreadsheet and perform “what if” analysis for days at a time. This method works best with complex forms that request a lot of input of various types. Other companies create an application to perform the required analysis and output the results to anyone within the company who needs the information. This method works best with simple forms where the company is expecting a specific type of information output. The following sections show you how to create a simple survey form that stores its results in a database. After you have the input application created, the sections that follow will show how to create an output application. Your company will require both application types to use survey data.
Creating the Database
The database for this example is extremely simple. It contains a combination of integer and string fields that track user survey input. The example won’t contain any odd data formats to translate or unexpected surprises from a data translation perspective. (See Chapter 8 for examples of data conversion problems.) Figure 7.9 shows the structure of this database.
Figure 7.9 A diagram of the simple database used in this example.
This example (and the one in the following section) relies on Microsoft SQL Server 7.0. The scripts found in the \Chapter 07\Data directory of the source code available from the Que Web site allow you to re-create the database within SQL Server. You can find the source code at http://www.quepublishing.com. The directory also contains a backup of the data so that you don’t have to enter it manually.
As you can see, the survey will ask questions about the food likes and dislikes of the people surveyed. The database uses the most efficient method possible to store these likes and dislikes in an anonymous manner. Given privacy concerns today, many companies will resort to the anonymous survey when all they need is the information the respondent can provide. Of course, we still need some way to identify the individual entries, and the survey form doesn’t provide a means to guarantee uniqueness. The EntryNumber field contains a simple number that uniquely identifies each record. SQL Server will update this field automatically. Notice that this is the only field used for the primary key.
This example won’t cover some obvious additions to this database, such as individual field indexes to allow fast data searches. The only intent of this example is to show the SOAP connection. You’d normally add filtering and indexing to reduce the data returned to the user and the amount of processing time required.
Creating the Server-Side Component
The server-side component accepts input from the survey form and places it in the database. Depending on the clients that you intend to use, the server-side component could also perform data analysis to ensure the input data meets specific value range and format criteria. However, this example will perform the range checking on the client, which is the option you should choose in most cases.
Performing value range, data type, and format checks on the client using scripts reduces processing load in several ways. First, it allows the user to receive instant feedback on entries as they’re made. This reduces user confusion and makes it more likely the data will appear in the correct format the second time the user enters it. Second, the server doesn’t have to perform the analysis. The server-side component can assume the data is in the correct format before it arrives. Finally, every verification step requires a call to the server if you use server-side processing. Using client-side processing reduces application bandwidth requirements and speeds entry. Now that you understand the coding basis for this component, let’s look at some code. Listing 7.1 shows the component source code. Notice the component contains only the input code. The output code (the thank you note) resides on the client in script form, as we’ll see later.
Listing 7.1 Data Survey Input Component Source Code
Public Sub EnterFoodResults(ByVal LikeMeat As Integer, _
                            ByVal LikeVegetable As Integer, _
                            ByVal LikeFruit As Integer, _
                            ByVal LikeGrain As Integer, _
                            ByVal DislikeMeat As Integer, _
                            ByVal DislikeVegetable As Integer, _
                            ByVal DislikeFruit As Integer, _
                            ByVal DislikeGrain As Integer, _
                            ByVal Comments As String)
    Dim Conn As ADODB.Connection    'Database connection.
    Dim Recc As ADODB.Recordset     'Recordset
    Dim ConnStr As String           'Database connection string.

    'Create a connection string.
    ConnStr = "Provider=sqloledb;" & _
              "Data Source=WinServer;" & _
              "Initial Catalog=Survey;" & _
              "User Id=sa;Password=; "

    'Connect to the database.
    Set Conn = New ADODB.Connection
    Conn.Open ConnStr

    'Obtain the required recordset.
    Set Recc = New ADODB.Recordset
    Recc.Open "Food", _
              Conn, _
              adOpenStatic, _
              adLockOptimistic, _
              adCmdTable

    'Enter the data.
    Recc.AddNew
    Recc!LikeMeat = LikeMeat
    Recc!LikeVegetable = LikeVegetable
    Recc!LikeFruit = LikeFruit
    Recc!LikeGrain = LikeGrain
    Recc!DislikeMeat = DislikeMeat
    Recc!DislikeVegetable = DislikeVegetable
    Recc!DislikeFruit = DislikeFruit
    Recc!DislikeGrain = DislikeGrain
    Recc!Comments = Comments
    Recc.Update

    'Close the recordset and database.
    Recc.Close
    Conn.Close

    'Clear the variables.
    Set Recc = Nothing
    Set Conn = Nothing
End Sub
As you can see, this is essentially a data access component that adds new records. You don’t need to provide any other capability within this component because the application won’t allow the user to modify his input after he submits the survey. In fact, it might be counterproductive to add other features because this would only open the door to unwanted cracker activity. Keeping the capability of the component limited is one form of security precaution. The component uses the following five-step process to submit a survey:
1. Create a connection to the database.
2. Open the recordset. In this case, we’ll use a table because the code is simply adding a record.
3. Create a new record.
4. Add content to the new record, and then update the record on the database. (The new data won’t appear on the server unless you use the Update() method.)
5. Close the recordset and the connection to the database. If you fail to close the database connections and free the object resources, the application will leak resources and you’ll eventually need to reboot the server.
Notice that the example uses “sa” as the user ID and that there isn’t a password. Normally, you’d use a custom user ID and password within the code. Make sure you keep the database secure by providing an account that can only add records for your surveys. This code addition works with the component limitations to reduce your risk in allowing unknown third parties to add survey information to a database.
Designing a Survey Form
You can use any of a number of survey form input types. The example uses a form-based application, but you could easily use a Web page or other variations. The client is providing input alone; the client doesn’t care about any data the server might return, so anything that can generate a SOAP message as output will work fine in this case. Listing 7.2 shows the client source code for this example.
Listing 7.2 Data Survey Input Client Source Code
Private Sub cmdQuit_Click()
    'Exit the application.
    End
End Sub

Private Sub cmdSubmit_Click()
    'Create the SOAP client.
    Dim FoodSurvey As SoapClient
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set FoodSurvey = New SoapClient
    FoodSurvey.mssoapinit _
        "http://winserver/soapexamples/Survey/Survey.WSDL", _
        "Survey", _
        "FoodSoapPort"

    'Enter the survey data.
    FoodSurvey.EnterFoodResults LikeMeat.ListIndex, _
                                LikeVegetable.ListIndex, _
                                LikeFruit.ListIndex, _
                                LikeGrain.ListIndex, _
                                DislikeMeat.ListIndex, _
                                DislikeVegetable.ListIndex, _
                                DislikeFruit.ListIndex, _
                                DislikeGrain.ListIndex, _
                                Comments.Text

    'Release the object.
    Set FoodSurvey = Nothing

    'Display a success message.
    MsgBox "Thank you for completing the survey!", _
           vbInformation Or vbOKOnly, _
           "Survey Entered Successfully"

    'We're finished, so exit.
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + FoodSurvey.faultcode + _
                   vbCrLf + vbCrLf + _
                   "Fault String: " + FoodSurvey.faultstring + _
                   vbCrLf + vbCrLf + _
                   "Fault Actor: " + FoodSurvey.faultactor + _
                   vbCrLf + vbCrLf + _
                   "Detail: " + FoodSurvey.detail
    MsgBox ErrorMessage, vbExclamation
End Sub

Private Sub Form_Load()
    'Set the combo boxes to known values.
    LikeMeat.ListIndex = 0
    LikeVegetable.ListIndex = 0
    LikeFruit.ListIndex = 0
    LikeGrain.ListIndex = 0
    DislikeMeat.ListIndex = 0
    DislikeVegetable.ListIndex = 0
    DislikeFruit.ListIndex = 0
    DislikeGrain.ListIndex = 0
End Sub
As you can see, the client creates a SOAP client, outputs the data, displays a success message, and exits. Notice the success message isn’t based on server feedback—it only acknowledges the user’s participation in the survey. It’s unlikely the user will participate in the survey a second time even if the client fails to deliver the requested data, so acknowledging the input is about all you can do. Notice this example uses a detailed error message. The application could be located in a kiosk or other remote location where the person monitoring the application has limited skills. If the application fails, you want to create the most detailed feedback message possible so the person reporting the problem doesn’t have to fiddle with the application much. In fact, the error reporting for this type of application should contain as much detail as you can provide. Of course, thorough testing will limit the number of times you actually need to rely on this information to fix an errant application.
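Because the client only sends data, any tool that can produce a SOAP message can play the client role. The Python sketch below builds a request by hand to show roughly what such a message contains. It is an assumption-laden illustration: the envelope namespace is the SOAP 1.1 one, but the method namespace urn:example-survey and the function build_request are invented for the example, and the result is not necessarily identical to what the SoapClient object in Listing 7.2 puts on the wire.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
METHOD_NS = "urn:example-survey"                        # hypothetical namespace

FIELDS = ("LikeMeat", "LikeVegetable", "LikeFruit", "LikeGrain",
          "DislikeMeat", "DislikeVegetable", "DislikeFruit", "DislikeGrain")


def build_request(answers, comments):
    """Return a SOAP envelope, as text, that calls EnterFoodResults."""
    if len(answers) != len(FIELDS):
        raise ValueError("expected one answer per survey question")
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{METHOD_NS}}}EnterFoodResults")
    for name, value in zip(FIELDS, answers):
        ET.SubElement(call, name).text = str(value)
    # ElementTree escapes markup characters in the comment automatically.
    ET.SubElement(call, "Comments").text = comments
    return ET.tostring(envelope, encoding="unicode")
```

A call such as build_request([0, 1, 2, 3, 4, 0, 1, 2], "Less salt & more <spice>") returns text that could be posted to the server endpoint; the comment survives because the library escapes the ampersand and angle brackets.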
Testing the Input Application
As with any other application, it’s important to test this one thoroughly. Of course, one of the first tasks you’ll perform is to move the component to the server, register it, and then construct a WSDL file. It’s especially important to check the WSDL output in this case. This is a survey form, so you don’t want the component to pass information back to the client. This is a one-way trip as far as the client is concerned. Figure 7.10 shows the WSDL file for this example. Notice there isn’t any feedback, not even a Result variable (as you’d receive when using a function within Visual Basic). The response message entry signifies that this is an empty tag. Contrast this with the Food.EnterFoodResults tag, which contains elements for all of the input values.
Figure 7.11 shows the main dialog box for this application. Notice that it uses combo boxes that don’t allow any kind of input. This reduces the chance that the user could enter a wrong value because he can only select from a list of predefined values. The text box is also set to limit the user’s input to 100 characters. The user is inputting a comment, in this case, so checking the content of the comment is unnecessary except for ensuring it isn’t too long.
Figure 7.10 Always verify that the WSDL file provides nothing in the way of a return value when working with surveys.
Figure 7.11 Careful application design will limit the amount of error checking code required in your application.
These careful design choices are why the client code in Listing 7.2 lacks any form of input error check. Depending on your application, you can significantly reduce or at least limit the amount of input checking code you need to create. Using this technique reduces the chance that the application will fail because there are fewer failure points to consider. Input checking is essential for this kind of application because the user might not have much in the way of computer skills. Testing the client application is relatively simple. Add some values to the application, and then click Submit. You should see the success message shown in Figure 7.12. Of course, this message only means that the client application works. It doesn’t signify that the data actually arrived at the server or that the server-side component entered data into the database. The lack of a SOAP failure message only means that SOAP sent the data to the server because there’s no return value to test.
Figure 7.12 This simple success message shows that the client application works.
As part of application testing, I normally design a dialog box-based application with a grid of the data values. You’ll find this application in the \Chapter 07\Data Edit directory of the source code available from the Que Web site. You can find the source code at http://www.quepublishing.com. Figure 7.13 shows what this application looks like. You’ll want to try all of the values that the program can accept and then try various combinations. Nine or ten successful tests normally indicate the application is ready for use with a simple application. Complex applications, especially those with many fill-in-the-blank answers, require more testing.
Figure 7.13 Create test applications that allow you to view the results of your survey client.
Make sure you erase any test data before you release the application to production use. At least one developer recently left the test data in place by accident. The client conducted the survey and began compiling the results. The list of accidentally included test messages skewed the data and invalidated the survey results. The company was able to recover by eliminating the test messages, but the developer lost a customer in the process. In short, make sure your testing phase includes cleaning up afterward.
Writing an Analysis Component
Surveys are useless if you can’t analyze their content and create a report. We used SOAP for the survey because the input application could appear anywhere. Using SOAP allows the survey application to send data to a remote database. Because surveys are usually one-time applications, many companies will use an outside consultant to create the survey and provide analysis for the results. Therefore, it’s quite likely that you’ll need to use SOAP for the analysis application as well. The survey components and results reside on your local server. The client will want to access these results as often as needed, so using SOAP to transfer data to a client application on their local workstation makes sense. Listing 7.3 shows the analysis component for this example.
Listing 7.3 Analysis Component for Food Survey Source Code
Public Function GetFoodData() As Integer()
    Dim Conn As ADODB.Connection    'Database connection.
    Dim Recc As ADODB.Recordset     'Recordset
    Dim ConnStr As String           'Database connection string.
    Dim Counter As Integer          'Loop counter variable.

    'Create a connection string.
    ConnStr = "Provider=sqloledb;" & _
              "Data Source=WinServer;" & _
              "Initial Catalog=Survey;" & _
              "User Id=sa;Password=; "

    'Connect to the database.
    Set Conn = New ADODB.Connection
    Conn.Open ConnStr

    'Obtain the required recordset.
    Set Recc = New ADODB.Recordset
    Recc.Open "Food", _
              Conn, _
              adOpenStatic, _
              adLockOptimistic, _
              adCmdTable

    'Create an array to hold the results. The database holds
    '8 Questions * 5 Answers Each, so we need 40 elements.
    Dim MyData() As Integer
    ReDim MyData(39)

    'Zero the array elements.
    For Counter = 0 To 39
        MyData(Counter) = 0
    Next

    'Loop through the database and count the entries.
    Do While Not Recc.EOF
        MyData(Recc!LikeMeat) = MyData(Recc!LikeMeat) + 1
        MyData(Recc!LikeVegetable + 5) = MyData(Recc!LikeVegetable + 5) + 1
        MyData(Recc!LikeFruit + 10) = MyData(Recc!LikeFruit + 10) + 1
        MyData(Recc!LikeGrain + 15) = MyData(Recc!LikeGrain + 15) + 1
        MyData(Recc!DislikeMeat + 20) = MyData(Recc!DislikeMeat + 20) + 1
        MyData(Recc!DislikeVegetable + 25) = _
            MyData(Recc!DislikeVegetable + 25) + 1
        MyData(Recc!DislikeFruit + 30) = MyData(Recc!DislikeFruit + 30) + 1
        MyData(Recc!DislikeGrain + 35) = MyData(Recc!DislikeGrain + 35) + 1
        Recc.MoveNext
    Loop

    'Return the array of raw data.
    GetFoodData = MyData

    'Close the recordset and database.
    Recc.Close
    Conn.Close

    'Clear the variables.
    Set Recc = Nothing
    Set Conn = Nothing
End Function
This component uses the same process for opening and closing both the connection and the recordset, so we won’t discuss those issues again. Notice that this component outputs the data to an array of integers. This technique wouldn’t work very well if the user wanted to see the comments as well. However, because this component is creating a summary of the data, using an array works fine. (We’ll see later that there are caveats when using arrays; compatibility problems abound.)

The component creates a dynamic array, dimensions it for the task at hand, and zeroes every entry in the array. Notice that there are 8 questions with 5 answers each. This means we need 40 array elements (indexes 0 through 39 in a zero-based array). The component directly manipulates the temporary array and adds the database entries using an index offset. It returns the raw data to the client for further analysis.

You might wonder why the example doesn’t use a multidimensional array. Most SOAP implementations don’t support multidimensional arrays, but many will support a single dimension array. Using a single dimension does flatten the data and make it more difficult to work with, but at least you can transfer it from server to client. None of the vendors that produce SOAP toolkits intend to support multidimensional arrays as of this writing.

Theoretically, you can perform full data analysis anywhere in the application. Returning raw data to the client, in this case, allows for local analysis. This means fewer trips to the server over a slow connection. In addition, the customer will use local resources for the final analysis, rather than rely on your server’s resources. The idea is to reduce bandwidth usage as much as possible to ensure good application performance despite any latency that SOAP might introduce.
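The index offsets in Listing 7.3 amount to a mapping from a question number and an answer number to one position in the flat array. The following sketch makes that mapping explicit. The function name FlatIndex and the constant ANSWERS_PER_QUESTION are hypothetical; they don’t appear in the original component, and the sketch assumes that every question has the same number of answers.

```vb
Option Explicit

'Assumption: every question has five answers, as in the food survey,
'and both the question and answer numbers are zero-based.
Private Const ANSWERS_PER_QUESTION As Integer = 5

'Convert a question number (0 to 7) and an answer number (0 to 4)
'into a position (0 to 39) in the single dimension array.
Public Function FlatIndex(ByVal Question As Integer, _
                          ByVal Answer As Integer) As Integer
    FlatIndex = (Question * ANSWERS_PER_QUESTION) + Answer
End Function
```

For example, LikeFruit is the third question (question number 2), so FlatIndex(2, 3) returns 13, the same element that Listing 7.3 reaches with Recc!LikeFruit + 10 when the answer is 3.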
Designing an Output Application

The survey analysis client will accept the raw data, perform any additional analysis, and present it to the user. In this case, the survey client merely converts the raw input to percentages and presents both to the user. Listing 7.4 shows the source code for the analysis client.
Listing 7.4 Analysis Client for Food Survey Source Code

Private Sub cmdGetSurvey_Click()
    Dim MyArray() As Integer            'Data array holding raw data.
    Dim Counter As Integer              'Loop counter variable.
    Dim LikeMeatTotal As Integer        'Total LikeMeat entries.
    Dim LikeVegTotal As Integer         'Total LikeVegetable entries.
    Dim LikeFruitTotal As Integer       'Total LikeFruit entries.
    Dim LikeGrainTotal As Integer       'Total LikeGrain entries.
    Dim DislikeMeatTotal As Integer     'Total DislikeMeat entries.
    Dim DislikeVegTotal As Integer      'Total DislikeVegetable entries.
    Dim DislikeFruitTotal As Integer    'Total DislikeFruit entries.
    Dim DislikeGrainTotal As Integer    'Total DislikeGrain entries.

    'Create the SOAP client.
    Dim FoodSurvey As SoapClient
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set FoodSurvey = New SoapClient
    FoodSurvey.mssoapinit _
        "http://winserver/soapexamples/SurveyAnalysis/SurveyArray.WSDL", _
        "SurveyArray", _
        "FoodSoapPort"

    'Fill the data array with the current statistics.
    MyArray = FoodSurvey.GetFoodData()

    'Release the object.
    Set FoodSurvey = Nothing

    'Zero the total variables.
    LikeMeatTotal = 0
    LikeVegTotal = 0
    LikeFruitTotal = 0
    LikeGrainTotal = 0
    DislikeMeatTotal = 0
    DislikeVegTotal = 0
    DislikeFruitTotal = 0
    DislikeGrainTotal = 0

    'Display the raw data on screen. Each total uses the same
    'array offset as the caption it accompanies.
    For Counter = 0 To 4
        LikeMeat(Counter).Caption = CStr(MyArray(Counter))
        LikeMeatTotal = LikeMeatTotal + MyArray(Counter)
        LikeVegetable(Counter).Caption = CStr(MyArray(Counter + 5))
        LikeVegTotal = LikeVegTotal + MyArray(Counter + 5)
        LikeFruit(Counter).Caption = CStr(MyArray(Counter + 10))
        LikeFruitTotal = LikeFruitTotal + MyArray(Counter + 10)
        LikeGrain(Counter).Caption = CStr(MyArray(Counter + 15))
        LikeGrainTotal = LikeGrainTotal + MyArray(Counter + 15)
        DislikeMeat(Counter).Caption = CStr(MyArray(Counter + 20))
        DislikeMeatTotal = DislikeMeatTotal + MyArray(Counter + 20)
        DislikeVegetable(Counter).Caption = CStr(MyArray(Counter + 25))
        DislikeVegTotal = DislikeVegTotal + MyArray(Counter + 25)
        DislikeFruit(Counter).Caption = CStr(MyArray(Counter + 30))
        DislikeFruitTotal = DislikeFruitTotal + MyArray(Counter + 30)
        DislikeGrain(Counter).Caption = CStr(MyArray(Counter + 35))
        DislikeGrainTotal = DislikeGrainTotal + MyArray(Counter + 35)
    Next

    'Display the percentage data on screen.
    For Counter = 0 To 4
        LikeMeatPct(Counter).Caption = _
            Left(CStr(MyArray(Counter) / LikeMeatTotal * 100), 4) & "%"
        LikeVegetablePct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 5) / LikeVegTotal * 100), 4) & "%"
        LikeFruitPct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 10) / LikeFruitTotal * 100), 4) & "%"
        LikeGrainPct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 15) / LikeGrainTotal * 100), 4) & "%"
        DislikeMeatPct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 20) / DislikeMeatTotal * 100), 4) & "%"
        DislikeVegetablePct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 25) / DislikeVegTotal * 100), 4) & "%"
        DislikeFruitPct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 30) / DislikeFruitTotal * 100), 4) & "%"
        DislikeGrainPct(Counter).Caption = _
            Left(CStr(MyArray(Counter + 35) / DislikeGrainTotal * 100), 4) & "%"
    Next

    'We're finished, so exit.
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + FoodSurvey.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + FoodSurvey.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + FoodSurvey.faultactor + vbCrLf + vbCrLf + _
                   "Detail: " + FoodSurvey.detail
    MsgBox ErrorMessage, vbExclamation
End Sub

Private Sub cmdQuit_Click()
    'Exit the application
    End
End Sub
As you can see, the client creates a SOAP client, makes the method call, and releases the SOAP client object. Most of the examples we’ve had so far have used the data returned from the method call immediately. This application performs analysis on the data before presenting all of it, so it’s important to release the SOAP client immediately to free resources.
The client provides two display routines. The first unfolds the array and presents the raw data onscreen. The second uses totals created during the raw data display to calculate data percentages. It again unfolds the data, performs the calculation, and displays the result. A multidimensional array would have made this process easier, but the flat presentation that the single dimension array provides doesn’t add much processing to the code in this case.
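The percentage routine divides each count by the total for its question, so a question with no responses yet would raise a run-time error. The following sketch shows one way to guard the calculation. The function name PercentText is hypothetical, and displaying “0%” for an empty question is an assumption about the desired behavior, not something the original client does.

```vb
Option Explicit

'Return one answer's share of a question's total as display text.
'Assumption: a question with no responses (Total = 0) should show
'"0%" instead of raising a run-time error.
Public Function PercentText(ByVal Count As Integer, _
                            ByVal Total As Integer) As String
    If Total = 0 Then
        PercentText = "0%"
    Else
        'Format rounds the percentage to one decimal place.
        PercentText = Format(Count / Total * 100, "0.0") & "%"
    End If
End Function
```

A caller could then write LikeMeatPct(Counter).Caption = PercentText(MyArray(Counter), LikeMeatTotal) in place of the Left(CStr(...), 4) expression.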
Survey Application Data Processing Concerns

If you compare the size of this data transfer to that of a parsed string, you’ll find that the string consumes fewer resources, but requires more client-side code. The reason for the difference is that SOAP sends the string as a single entity, but uses XML elements to transfer each array element separately. Surveys tend to produce large amounts of data, so you might find that you need to use the string presentation to obtain acceptable application performance.

This and other database clients might experience another problem when using the Microsoft SOAP Toolkit (or other toolkits that rely on IIS). The example uses Internet Server Application Programming Interface (ISAPI) instead of Active Server Pages (ASP). ISAPI limits the size of the message it can transfer. The Microsoft SOAP Toolkit automatically sets this limit during installation to prevent denial of service (DoS) attacks by crackers. You can change it using a registry setting, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSOAP\SOAPISAP. This key has three values: MaxPostSize, NumThreads, and ObjCachedPerThread. Changing MaxPostSize will allow you to send larger messages using ISAPI.
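The string alternative mentioned in this section can be sketched as a pair of helper functions: one that packs the counts into a single delimited string on the server, and one that parses the string back into an array on the client. The names PackCounts and UnpackCounts and the comma delimiter are assumptions made for illustration; the example components don’t include them.

```vb
Option Explicit

'Pack an Integer array into one comma-delimited string so that SOAP
'can send it as a single element.
Public Function PackCounts(Counts() As Integer) As String
    Dim Parts() As String
    Dim I As Long

    ReDim Parts(LBound(Counts) To UBound(Counts))
    For I = LBound(Counts) To UBound(Counts)
        Parts(I) = CStr(Counts(I))
    Next
    PackCounts = Join(Parts, ",")
End Function

'Parse the delimited string back into a zero-based Integer array.
'Assumption: Packed holds at least one value and only whole numbers.
Public Function UnpackCounts(ByVal Packed As String) As Integer()
    Dim Parts() As String
    Dim Result() As Integer
    Dim I As Long

    Parts = Split(Packed, ",")
    ReDim Result(UBound(Parts))
    For I = 0 To UBound(Parts)
        Result(I) = CInt(Parts(I))
    Next
    UnpackCounts = Result
End Function
```

The extra client-side code is the price of the smaller message: the client has to call something like UnpackCounts before it can use the data.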
Testing the Output

One of the problems with using automatic code generators is that they don’t always produce the best code. Our example uses an array that will normally work with any Microsoft SOAP Toolkit client. However, you need to know that you might encounter compatibility problems when using a WSDL file created with the WSDL Generator utility that ships with the Microsoft SOAP Toolkit.

Figure 7.14 shows the WSDL output of the SurveyAnalysis component. Notice that the array dimensions are undefined. This will cause problems for some SOAP implementations. You can hand edit the WSDL file to add dimensions as necessary, or you can use a different WSDL generator. The lack of dimensions won’t cause a problem with Microsoft clients.

The analysis client should run relatively quickly. It uses small data transfers to provide the client with all of the survey data. Figure 7.15 shows the output from this application.
Creating a Simple Data Entry Form Application

The data entry form application covers a lot of ground. For example, a survey is a type of data entry form, albeit limited in scope. Some people also include full-fledged database applications (discussed in Chapter 8) in this mix. However, there’s a middle-ground application that better defines the simple data entry form application. It’s the kind of application where you expect two-way communication, but the communication is limited.
Figure 7.14 Look out for compatibility problems when working with arrays.
Figure 7.15 The output from the analysis client shows all of the statistics for the food survey.
In this section, we’ll look at a data entry form where the user enters information and receives a response based on his input. The user enters data, but can’t modify the input in any way. The conversation is two-way because the server provides a response based on the user input. The following sections will show you how to put this example together.
Creating the Database

This example uses two tables. The first appears in Figure 7.16. It contains the user’s input. Notice that the table uses a primary key of the user’s first and last name to ensure there aren’t any duplicates. You’d need to make the primary key more distinct in a real-world application because many people have the same first and last name.

Figure 7.16 The user input is limited to name, age, sex, a like, and a dislike.
The second table appears in Figure 7.17. It contains two fields. The first is a number used to access a prediction, and the second is the prediction string. The user won’t have access to this table; the administrator will control the prediction strings.

Figure 7.17 This prediction string table is under the administrator’s control.
Designing the Server-Side Component

The server-side component will accept input from the user, then use that input to construct an output string. The output string will also contain data taken from another table. Users commonly rely on data entry applications to gain access to information they want for a particular purpose. Filling out a survey or user identification form might garner access to a private Web site. Listing 7.5 shows the source code for this example.
Listing 7.5 Data Entry Component Source Code

Public Function GetPrediction(ByVal First As String, _
                              ByVal Last As String, _
                              ByVal Age As Integer, _
                              ByVal Sex As String, _
                              ByVal Likes As Integer, _
                              ByVal Dislikes As String) As String
    Dim Conn As ADODB.Connection    'Database connection.
    Dim Recc As ADODB.Recordset     'Recordset
    Dim ConnStr As String           'Database connection string.
    Dim Output As String            'Prediction string.
    Dim RecCount As Integer         'Total number of prediction records.

    'Create a connection string.
    ConnStr = "Provider=sqloledb;" & _
              "Data Source=WinServer;" & _
              "Initial Catalog=DataEntry;" & _
              "User Id=sa;Password=; "

    'Connect to the database.
    Set Conn = New ADODB.Connection
    Conn.Open ConnStr

    'Obtain the user input recordset.
    Set Recc = New ADODB.Recordset
    Recc.Open "UserInput", _
              Conn, _
              adOpenStatic, _
              adLockOptimistic, _
              adCmdTable

    'Enter the data.
    Recc.AddNew
    Recc!First_Name = First
    Recc!Last_Name = Last
    Recc!Age = Age
    Recc!Sex = Sex
    Recc!Likes = Likes
    Recc!Dislikes = Dislikes
    Recc.Update

    'Create the output string.
    Output = "Hello " & First & " " & Last
    If Age > 0 And Age < 40 Then
        Output = Output & ". You're a young "
    ElseIf Age >= 40 And Age < 65 Then
        Output = Output & ". You're a middle aged "
    ElseIf Age >= 65 Then
        Output = Output & ". You're an old "
    Else
        Output = Output & ". You're a mysterious "
    End If

    If UCase(Sex) = "M" Then
        Output = Output & "man."
    ElseIf UCase(Sex) = "F" Then
        Output = Output & "woman."
    Else
        Output = Output & "person of unknown gender."
    End If

    'Close the user input recordset.
    Recc.Close
    Set Recc = Nothing

    'Open the prediction recordset.
    Set Recc = New ADODB.Recordset
    Recc.Open "Prediction", _
              Conn, _
              adOpenStatic, _
              adLockOptimistic, _
              adCmdTable

    'Determine the number of records.
    Recc.MoveLast
    RecCount = Recc!PNumber

    'Find a random location in the prediction database.
    Randomize
    RecCount = Int((RecCount * Rnd) + 1)
    Recc.MoveFirst
    Recc.Find ("PNumber = " & RecCount)

    'Finish creating the output string.
    Output = Output & vbCrLf & "Your prediction is: " & Recc!PString
    GetPrediction = Output

    'Close the recordset and database.
    Recc.Close
    Conn.Close

    'Clear the variables.
    Set Recc = Nothing
    Set Conn = Nothing
End Function
As you can see, this component begins by creating a connection to the database. We require access to two tables, but not at the same time. This allows use of the same variable for both accesses, as shown in the source code. Notice that in the first case, the application adds a record to the database without performing further analysis. In the second case, the component queries a random value from the database without adding anything new. The remainder of the code in this component performs analysis on the input. Data entry applications often perform this type of service (of course the analysis is a little more complex than the code shown here). In this case, the user obtains feedback on the input.
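The random lookup in Listing 7.5 depends on the expression Int((RecCount * Rnd) + 1). The following sketch isolates that calculation so the range is easier to see. The function name PickRecordNumber is hypothetical. Note also an assumption that the listing makes silently: the PNumber values must run from 1 to the highest number without gaps, or the Find call might not locate a record.

```vb
Option Explicit

'Pick a whole number from 1 to RecCount, inclusive.
'Rnd returns a value greater than or equal to 0 and less than 1, so
'Int(RecCount * Rnd) falls between 0 and RecCount - 1. Adding 1
'shifts the result into the range used by the PNumber field.
Public Function PickRecordNumber(ByVal RecCount As Integer) As Integer
    PickRecordNumber = Int(RecCount * Rnd) + 1
End Function
```

As in the listing, call Randomize once before the first use so that the sequence differs from one run to the next.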
Creating a Data Entry Client

The data entry client application is responsible for gathering data from the user, ensuring the data is of the right type, and sending it along to the server. It also presents the output to the user. In this case, the output is a simple message box. Listing 7.6 shows the code for this example.
Listing 7.6 Data Entry Client Source Code

Private Sub cmdPredict_Click()
    Dim SendAge As Integer

    'Create the SOAP client.
    Dim MyPrediction As SoapClient
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set MyPrediction = New SoapClient
    MyPrediction.mssoapinit _
        "http://winserver/soapexamples/DataEntry/DataEntry.WSDL", _
        "DataEntry", _
        "PredictionSoapPort"

    If Len(txtAge.Text) = 0 Or Not IsNumeric(txtAge.Text) Then
        SendAge = 0
    Else
        SendAge = txtAge.Text
    End If

    MsgBox MyPrediction.GetPrediction(txtFirst.Text, _
                                      txtLast.Text, _
                                      SendAge, _
                                      comboSex.Text, _
                                      comboLikes.ListIndex, _
                                      comboDislikes.ListIndex)

    'We're finished, so exit.
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + MyPrediction.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + MyPrediction.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + MyPrediction.faultactor + vbCrLf + vbCrLf + _
                   "Detail: " + MyPrediction.detail
    MsgBox ErrorMessage, vbExclamation
End Sub

Private Sub cmdQuit_Click()
    'Exit the application.
    End
End Sub

Private Sub Form_Load()
    comboLikes.ListIndex = 0
    comboDislikes.ListIndex = 0
End Sub
Notice that we don’t provide default values for either age or sex. In addition, you’ll notice that the component code in Listing 7.5 handles these two fields with some level of discretion. Finally, the database design allows for NULL values in these two fields. All of these coding practices reflect the need to protect the user’s privacy. It’s a consideration for all SOAP applications because they provide some type of online service that might appear in the public eye.

The code does check for problems in the age field. It’ll provide a value if the user doesn’t enter an age, one that the application will associate with a NULL value. Likewise, the application assumes the user didn’t want to add a value if he supplies a string rather than a number in the age field.

None of the other fields requires a validity check. The application must assume that the user knows his name. The sex, likes, and dislikes fields all use combo boxes, which allow us to keep values in check without adding code. Notice that the code automatically sets the likes and dislikes values to default values, but leaves the sex field alone so that there isn’t any assumption of gender.
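The age check in Listing 7.6 accepts any text that IsNumeric approves, which includes values such as 40000 that are too large for an Integer and would raise an overflow error during the assignment. The following sketch shows one way to tighten the check. The function name ParseAge and the 1 to 150 range are assumptions added for illustration; the original client contains neither.

```vb
Option Explicit

'Convert the text of the age field to the value sent to the server.
'Blank, non-numeric, and out-of-range input all become 0, the value
'the application treats as "no age supplied."
'Assumption: the 1 to 150 range is illustrative only.
Public Function ParseAge(ByVal RawText As String) As Integer
    Dim Value As Double

    ParseAge = 0
    If Len(Trim$(RawText)) = 0 Then Exit Function
    If Not IsNumeric(RawText) Then Exit Function

    'Use a Double for the comparison so that a large number can't
    'overflow before the range check runs.
    Value = CDbl(RawText)
    If Value >= 1 And Value <= 150 Then ParseAge = CInt(Value)
End Function
```

With a helper like this, the If block in cmdPredict_Click could shrink to SendAge = ParseAge(txtAge.Text).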
Testing the Data Entry Application

It’s time to test the application. You’ll need to register the component on the server and create the usual WSDL file. Figure 7.18 shows some sample output. Notice that the string contains a mix of user input and data from the prediction database.

Figure 7.18 The output of the Data Entry application is a restatement of the input data and a prediction.
It’s important to include a range of input values during the testing process. Try arcane combinations of input. For example, this application doesn’t test for blank fields, something that could cause problems. It should also check for duplicate entries in the database, but SQL Server does take care of this problem.
Security, Privacy, Performance, and Reliability Issues

It might seem a bit strange to place the four topics in the heading for this section together, but you really do need to consider them as a group. Changing one item normally affects the other three in a negative way. For example, when you increase privacy, you decrease what you know about those visiting your Web site. As a result, your ability to search for potential security flaws decreases. Enhancing privacy also means using more system resources, so performance takes a hit. Finally, adding more processing to your system increases the number of failure points, which decreases reliability. In short, you can’t make one change without changing your entire setup.
Interactions are nothing new for developers. Most of us have had to deal with code interactions as part of every programming exercise. The code you create in one area invariably produces an effect somewhere else—sometimes in a way that you never anticipated. However, SOAP represents a new set of interactions to consider because the conditions are different. SOAP operates with the Internet, making it possible that outside sources will affect your system. In addition, SOAP features new functionality that you haven’t had in the past. The following sections will look at security, privacy, performance, and reliability from a SOAP perspective. We’ll consider how this new environment will affect how you design and write an application. These sections will also discuss a difference in ideology—the way you approach a development problem.
Security

One of the most important features for any application today is security. Read some of the trade press magazines and you’ll discover the latest cracker attack on a large company. Some of these attacks are so severe that they attract attention in large national magazines and newspapers. In short, the problem is real and it’s significant.

As a developer, you’re right in the middle of the current problems with security. In days past, a developer could rely on the operating system and third-party products to take care of security problems. Local server settings were often enough to ensure safe computing in a LAN environment. These sources still help, but the developer often has to do more to ensure the application remains safe.

The question, of course, is how safe is safe enough? Every time you add a new security feature, you further obscure the application from public view. However, security features consume resources that affect the performance of your application. In addition, each new security feature is a failure point, so adding a security feature reduces the reliability of your application. In short, you have to reach a balance between enough security and adequate performance and reliability.

Adding security features can affect privacy. As you add features, the user normally has to provide more information for identification purposes. After all, one of the main goals of security is to ensure that you know who is at the other end of the wire. This is the negative impact of security on privacy. An anonymous user can’t access your site, so on some level he has to give up some privacy to use a resource on your server.

Fortunately, security also has a positive impact on privacy. For one thing, the user worries less about someone else gaining access to his information. By reducing the chances of a cracker breaking into your server, you also increase the possibility that no one will find out about your user base.
Of course, everyone should know that there aren’t any completely secure systems. If a cracker has enough time, he will eventually break into your system. Security increases the time required for a break-in and allows you to detect a break-in with greater ease, but it won’t prevent a break-in from happening. As part of your security strategy, you need to consider how much management the system will require and use that as the determining factor for the security, performance, and reliability trade-offs.
From a SOAP perspective, adding security to an application is difficult. SOAP provides no built-in security, and using external security is a halfway measure at best. You also have to consider the severe impact that adding layers of security will have on your application. SOAP already bloats the size of data by adding XML tags to it. These tags are necessary in the absence of the information that binary data transfer provides. Encrypting the data increases its size even more. In short, you need to use larger numbers when predicting the impact of security on the performance and reliability of your application.

How large of an impact could using security and SOAP together have on your data? Some people report that their data grows to as much as 600% of its original size when converted to SOAP, and that using SSL then increases the result to 150% of that size. This is probably a worst-case scenario, and few real-world statistics exist at this moment. However, given these numbers, a 1,000-byte data transfer suddenly consumes 9,000 bytes when transferred using both SOAP and SSL (1,000 × 6 × 1.5 = 9,000). (These figures are based on several sources, including vendors and newsgroup messages; there aren’t any formal tests available at the time of this writing.)
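The arithmetic behind the 9,000-byte figure is easy to repeat for your own payload sizes. The following sketch treats the reported percentages as multipliers (6 for SOAP and 1.5 for SSL). The function name EstimateWireSize is hypothetical, and the factors are the anecdotal worst-case numbers quoted in this section, not measurements.

```vb
Option Explicit

'Estimate the number of bytes on the wire for a payload.
'Assumption: SoapFactor and SslFactor are simple multipliers, such
'as 6 and 1.5 for the worst-case figures quoted in the text.
Public Function EstimateWireSize(ByVal PayloadBytes As Long, _
                                 ByVal SoapFactor As Double, _
                                 ByVal SslFactor As Double) As Long
    EstimateWireSize = CLng(PayloadBytes * SoapFactor * SslFactor)
End Function
```

Calling EstimateWireSize(1000, 6, 1.5) returns 9000. Substituting factors that you measure on your own system will give a more realistic estimate.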
Privacy

Countries around the world are becoming serious about the privacy of individuals. The reason is simple. Not only does protecting individual privacy make good political sense, but people do have a right to privacy. From a corporate goodwill perspective, it’s essential to protect customer data. You won’t maintain a customer base for long if your company gains a reputation for giving out customer information at the drop of a hat, especially in countries where privacy policies tend toward strict confidentiality.

Every time you collect survey information from an unknown party, you risk divulging information about that individual that could have unforeseen repercussions. Given that privacy is such an important issue, you don’t want to risk giving out information, even if it’s the result of a cracker attack on your system. In short, the best way to conduct a survey is to gather the information anonymously. However, collecting information in this way almost guarantees that someone will try to contaminate the survey by stuffing the ballot box (taking the survey more than one time).
One favorite method of overcoming the problem of identification is to ask for the user’s e-mail address and allow just one entry per e-mail address. This is the least invasive way to ensure that each survey entry is unique. If a cracker does break into your system, he’ll receive an e-mail address that’s somewhat public and not any personal information about the respondent. The e-mail address also allows you to perform spot checks and ensure that the person actually participated in the survey. You don’t have to know a name, address, or other information to use this technique, so it’s safe from a privacy perspective.
Privacy issues are difficult on other fronts as well. The less information you collect from someone before allowing access to your system, the higher the chance that he’ll use the opportunity to circumvent your security. Increasing privacy tends to affect security in a negative manner because security principles dictate knowing as much as possible about the person you’re dealing with over any connection.

Adding privacy features to your application also increases processing load. Any additional processing uses resources that you could use for other purposes. Consequently, increasing privacy normally incurs a performance loss. Fortunately, this isn’t always the case. The amount of performance loss depends on how you implement privacy. Unlike security, which you want to automate, you can implement privacy through policies and other human activity changes. For example, you could create a company policy that requires personal telephone contact for anything other than an anonymous survey.

Your application could actually see a performance increase if company policy dictates purging all personal information at regular intervals. Removing individual surveys and preserving just the raw tallies after a longer storage period could allow a further performance increase because the databases holding the data contain fewer records.

Privacy usually affects reliability in a negative way, even if you’re implementing privacy using policies. No matter which way you look at it, adding privacy features to an application means adding more failure points. In fact, using policies actually decreases reliability even more than adding software components because policies require interpretation on the part of the implementer.
Performance

Many developers are finding that the myth of unlimited bandwidth and resources is just that, a myth. As soon as a company adds more bandwidth to a network, management and users find new ways to use that bandwidth. A network that feels comfortable for a while after an upgrade reverts to providing slow access in a short time because of increased bandwidth usage. Likewise, a larger hard drive invites users to store more data, rather than archive it. Any increase in storage will almost certainly disappear. In short, performance is always going to be a problem that developers need to solve.

One of the dichotomies that developers must consider is that society will continue to demand new application features while users complain about the problems of code bloat. For example, we’ve discussed increased security and privacy features in this chapter. Users don’t dictate these requirements; corporate management and government demand that developers add them. The developer is caught in the middle trying to decide how best to implement required features in such a way that performance isn’t affected, but application requirements are satisfied.

SOAP applications suffer from code bloat more than desktop applications because they require extra security and privacy. An open application requires more code to ensure a safe user session. Unfortunately, SOAP applications also have fewer resources because Internet connections aren’t as fast as the LAN in most cases. As you can see, if you spend a little time tuning performance on a desktop application, you’ll spend a lot of time tuning SOAP application performance.
(There are exceptions to every rule. See the “When SOAP Is Faster” section of Chapter 1, “An Overview of SOAP,” for some ideas on when SOAP actually provides better performance than a desktop application.)

The best way to view this situation is as a problem of trade-offs. If you want maximum performance from your application, you have to be willing to design your application carefully. SOAP can help in this regard if you remember some simple SOAP rules. For example, SOAP is a one-way transfer protocol. Sure, you can use a request/reply format for your messages, but there isn’t a prolonged session to consider. This means that you can add security to just the data transfers that contain sensitive information.

Likewise, establish user identity using secure e-mail or some other means. Provide the user with a keyword or other unique means of identification he can use. You can ensure privacy from prying cracker eyes by using code and keeping user records on a server that isn’t directly accessible from the Internet connection. Using this technique means the impact on performance will remain small while keeping user privacy high.
Many developers are used to using long function and variable names. Use short names whenever possible for public functions and variables. This decreases the size of the SOAP tags and enhances application performance.
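To get a feel for how much a name costs, you can count the characters in the opening and closing tags that carry it. The following sketch does that for a repeated element. The function name TagOverhead is hypothetical, and the calculation assumes single-byte characters with no namespace prefix or attributes, so treat the result as a rough lower bound.

```vb
Option Explicit

'Bytes used by the opening and closing tags of Count elements that
'share one name: "<name>" takes Len + 2 bytes and "</name>" takes
'Len + 3 bytes.
'Assumption: single-byte characters, no namespace prefix, and no
'attributes on the element.
Public Function TagOverhead(ByVal ElementName As String, _
                            ByVal Count As Long) As Long
    TagOverhead = Count * (2 * Len(ElementName) + 5)
End Function
```

For a 40-element array, TagOverhead("DislikeVegetableTotal", 40) returns 1880, while TagOverhead("dv", 40) returns 360.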
Reliability

Application reliability is an ever-increasing concern as companies begin competing in a world market. The term “24/7” is as much a part of our vocabulary now as Internet. Most corporations interested in SOAP application development have a requirement for applications that run 24 hours a day, 7 days a week without failure. Hardware engineers have gone to lengths to ensure that a hardware failure means a loss of performance, not down time. Many vendors now strive for five 9s (99.999%) of up time.

As mentioned throughout this section, any time you add a new module to your application, you reduce reliability. That’s because reliability prediction bases the reliability of your system on the number of failure points (application modules) as well as the reliability of a particular module. However, many of the modules in your application aren’t under your direct control. For example, you’ll likely use another vendor’s XML parser instead of one created by your in-house staff. Likewise, we’ve used SOAP toolkits created by other companies throughout the book. It makes sense from a developer productivity perspective to use third-party products.

In addition to third-party application modules, you also have the activities of others to consider. The Internet connection you use is under the ISP’s control. Crackers can probe messages as they move between locations on the Internet. Even nature gets into the act by adding noise to the line and interrupting the signal.
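One simple way to see why each added module reduces reliability is to multiply the reliability figures of the individual modules. The following sketch uses that series model. The function name SeriesReliability is hypothetical, and the model assumes that modules fail independently and that the application needs every one of them, which is a simplification rather than a formal prediction method.

```vb
Option Explicit

'Estimate system reliability as the product of module reliabilities.
'Assumption: modules fail independently and the application needs
'every one of them (a simple series model, used for illustration).
Public Function SeriesReliability(ModuleReliability() As Double) As Double
    Dim I As Long
    Dim Result As Double

    Result = 1#
    For I = LBound(ModuleReliability) To UBound(ModuleReliability)
        Result = Result * ModuleReliability(I)
    Next
    SeriesReliability = Result
End Function
```

Under this model, five modules that are each 99.9% reliable give a system that is only about 99.5% reliable, which is why combining modules can help.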
Obviously, you have no control over conditions outside the application or modules you didn’t design, but you can control application elements you did design. The following list provides some ideas for increasing the reliability of SOAP applications:
■ Keep the number of modules in your application small. Use modules as needed to provide performance, privacy, and security benefits, but try combining modules whenever possible.
■ Investigate compatibility issues between platforms before you begin the design process, then compensate within your application code.
■ Test modules individually and integrated into the application. Modules that you test fully will cause fewer reliability problems.
■ Look for places where you can reduce the number of processing steps. This increases both performance and reliability. Of course, you need to consider the nature of SOAP applications and add both security and privacy features as needed.
■ Use standards-based programming whenever possible. This ensures the remote site will understand the messages you send. This is especially important for security and will probably become important for privacy issues as well.
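The effect of module count on reliability is easier to see with numbers. The following Python sketch assumes the simplest prediction model, in which every module must work and modules fail independently, so the system reliability is the product of the module reliabilities. The 99.9% figure for each module is a hypothetical value chosen for illustration, not a measurement.

```python
from math import prod

def series_reliability(module_reliabilities):
    """Reliability of a chain in which every module must work (independent failures assumed)."""
    return prod(module_reliabilities)

def downtime_minutes_per_year(availability):
    """Expected downtime per year for a given availability fraction."""
    return (1.0 - availability) * 365 * 24 * 60

# Hypothetical figures: each module is assumed to be 99.9% reliable.
three_modules = series_reliability([0.999] * 3)
six_modules = series_reliability([0.999] * 6)

print(f"3 modules: {three_modules:.4%}")
print(f"6 modules: {six_modules:.4%}")
print(f"Five 9s allows about {downtime_minutes_per_year(0.99999):.1f} minutes of downtime per year")
```

Under these assumptions, doubling the number of modules roughly doubles the expected failure rate, which is why combining modules appears first in the list.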
Handling Data Entry and Survey Form Errors
Data entry and survey form applications have one thing in common: the end user is unlikely to know as much as the average desktop user about error messages. In addition, these users aren’t in a position to do much about the error even if they do know enough to understand what it means. In short, you need a new way to handle application errors.
Providing a means to record errors on the server for the component isn’t hard. All you need to do is trap the error and record the event in a log. When working with a survey application, you use the event log in place of a message to the user. Try to record the incoming data as part of the event message so that you preserve the survey if possible. This allows someone to enter the survey again later after the component problem is resolved.
In previous chapters, we used the SOAP Fault message to handle error messages. However, consider the user’s reaction if he sees a message that states the server isn’t responding or has experienced some type of processing difficulty. The user might be at a kiosk and won’t care about the server. All the user knows is that the application isn’t providing the required information or allowing him to complete an action. Considering the fact that you’re asking the user for a service (answering survey questions, for example), he might not care and give up after the first problem.
The inability to send a message back to the user of a survey application or the need to send something comprehensible to the user of a data entry application presents challenges. If you’re using the Microsoft SOAP Toolkit, it means that you have to use the ASP method of working with the SOAP message because the ISAPI method lacks the means to capture the fault message. Edit the ASP file by hand to create an event log entry for the administrator, rather than send the fault message back to the client. When working with a data entry application, you’ll need to come up with a simple message that everyone will understand, but that doesn’t necessarily provide complete error information.
We’ve covered the Web server and the component. What happens if the error occurs between the client and the Web server? For example, the application might receive a timeout message if the server takes too long to respond. Depending on the SOAP toolkit you use, it might be possible to add scripts to handle some of the common problems such as timeouts. Unfortunately, recording an event log entry is going to be difficult or impossible. In this case, you might want to provide a telephone number as part of the user message. This will allow the user to call someone to notify them of the problem without burdening the user with information he doesn’t want to know.
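The trap-and-log approach for a data entry application can be sketched in a few lines. This Python example is only an illustration of the idea, not the ASP technique that the Microsoft SOAP Toolkit requires. The function names, the message text, and the telephone number are all hypothetical.

```python
import json
import logging

log = logging.getLogger("data_entry")

# Hypothetical wording and number; the user sees this instead of fault details.
SUPPORT_MESSAGE = "We could not save your entry. Please call 555-0100."

def store_entry(entry):
    """Stand-in for the real component; it fails to simulate a server fault."""
    raise RuntimeError("database unavailable")

def submit_entry(entry, store=store_entry):
    """Trap any component error, log it with the incoming data, and return a plain message."""
    try:
        store(entry)
        return "Your entry was saved."
    except Exception:
        # Record the submitted data so someone can enter it again later.
        log.exception("Entry failed; data=%s", json.dumps(entry))
        return SUPPORT_MESSAGE

print(submit_entry({"name": "Pat", "quantity": 3}))
```

The administrator gets the full traceback and the original data in the log, while the user gets a short message that asks for nothing more than a telephone call.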
Using Templates for Quick Forms
Ultimately, the need for a user interface ties SOAP to desktop or browser applications somewhere along the way. The need to present the user with an interface means writing code that displays labels and other visual cues that humans need to make sense of computer data. You might find that you’re producing forms that vary only slightly because of the kind of data you’re sharing or the way that your application works. Surveys and data entry applications are well suited to templates. A template describes those common elements and allows the application to describe the rest. Consider templates as the latest form of “fill-in-the-blank” technology.
Templates aren’t new. Web servers use templates regularly. One of the best reasons to use frames is to allow a developer to place common elements in set locations and other windows for changing data. Many Web servers even support the concept of headers and footers to display common data without resorting to frames. The addition of technologies such as cascading stylesheets (CSS) is the result of developer reuse requirements.
SOAP differs from many Web applications in that it needs these common elements on the user’s machine because it uses client-side processing. In other words, creating an ASP that creates a Web page based on user input probably won’t do the job unless you’re using the ASP to create the initial page. The application should also include client-side features that rely on templates so the application transfers the least amount of information to and from the server. Remember that a SOAP application is client-based, not server-based. The user isn’t going to download the SOAP application every time as he would a Web page.
Desktop applications present a different challenge. Most desktop applications rely on predefined forms and never use any form of template. You can overcome this problem in a number of ways. For example, programming language products, such as Visual C++, allow you to combine HTML with standard desktop applications. In short, you could use HTML templates in the same way that you would with a Web-based application.
The bottom line is that templates can save a programmer time and create a more consistent look for your application. Implementing template technology on the client isn’t difficult, but it will require you to rethink some of the ways you’ve worked with templates in the past.
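A small sketch shows the “fill-in-the-blank” idea on the client. It uses Python’s standard string.Template class as a stand-in for whatever template mechanism your client environment provides. The form layout and field names are hypothetical.

```python
from string import Template

# Hypothetical template: the shared layout lives on the client,
# and only the values that fill the blanks come from the server.
FORM_TEMPLATE = Template(
    "<h1>$title</h1>\n"
    "<form>\n"
    "  <label>$question</label>\n"
    "  <input name=\"$field\">\n"
    "</form>"
)

def render_form(values):
    """Fill the blanks in the client-side template with server-supplied values."""
    return FORM_TEMPLATE.substitute(values)

page = render_form({
    "title": "Customer Survey",
    "question": "How did you hear about us?",
    "field": "source",
})
print(page)
```

Because the layout never crosses the wire, the server only needs to send the three short values, which keeps the amount of transferred data low.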
Using MIME for SOAP Applications
As SOAP applications become more complex and data handling needs increase, we’ll eventually need ways to transfer data that doesn’t meet the XML specification requirements. The real world contains data types that cause developers to pull their hair out by the handful. Transferring data from older databases on mainframes could prove especially difficult. In short, the data types that the XML specification defines are fine as a start, but hardly the end of the discussion.
One of the big problems that SOAP vendors face right now is the issue of working with unwieldy data types. SOAP is essentially text, which limits the number of ways to translate data. The answer that everyone is using right now is Base64 encoding. Of course, trying to implement this as part of a SOAP message is difficult for even the best programmer.
The Multipurpose Internet Mail Extensions (MIME) attachments described in the “Understanding SOAP Attachments” section of Chapter 2, “SOAP in Theory,” promise an easier way to handle these data types. Using MIME means that you can separate the SOAP message from the peculiar data you ask it to deliver to the client. Images no longer pose a problem because they appear in a separate part of the message.
You currently have to use attachments or Base64 encoding to transfer images when working with SOAP. The reason is simple: XML doesn’t support an image data type, and SOAP relies on XML standards support. A future version of SOAP might include a native image type. The Association for Information and Image Management International (AIIM) (http://www.aiim.org) is working on a new image standard for XML. The International Organization for Standardization (ISO) (http://www.iso.ch/) is supporting the effort as part of TC171/SC2/WG2. AIIM also has plans to work closely with the World Wide Web Consortium (W3C) (http://www.w3.org/). It’s helpful to keep track of standards such as this one because they greatly reduce the effort required to create database applications.
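To see what the Base64 approach involves, consider this minimal Python sketch. It encodes a block of bytes as text, places the text in an XML element, and recovers the bytes on the other side. The element name photo is hypothetical, and the byte string is only a stand-in for real image data.

```python
import base64
import xml.etree.ElementTree as ET

def wrap_binary(tag, data):
    """Return an XML element whose text is the Base64 form of the bytes."""
    element = ET.Element(tag)
    element.text = base64.b64encode(data).decode("ascii")
    return element

def unwrap_binary(element):
    """Recover the original bytes from a Base64 element."""
    return base64.b64decode(element.text)

# Stand-in for image data; any byte string works.
photo = bytes(range(256))
element = wrap_binary("photo", photo)
xml_text = ET.tostring(element, encoding="unicode")

print(len(photo), "bytes become", len(element.text), "characters of text")
```

Base64 turns every three bytes into four characters, so the encoded data is about a third larger than the original. That growth comes on top of the size that XML tagging already adds.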
My original plan for this chapter included showing how to use MIME to add pizzazz to a data entry application. Imagine being able to send photographs or other hard-to-handle data across the Internet using SOAP. Unfortunately, no one has a SOAP toolkit that will create a MIME message at the time of this writing. The specification defines the techniques for working with MIME, but the required support is sadly lacking. One vendor went so far as to say that MIME support would have to wait until enough user comments showed a need. I did learn a few things about using MIME with SOAP during the course of planning this chapter. For example, some developers have decided not to wait and are writing their own SOAP implementations that support MIME. Working with these talented individuals has allowed me to draw the conclusions outlined in the sections that follow.
Special Considerations for MIME Messages
Implementing MIME for SOAP will require two levels of changes. The first is to the SOAP toolkit that you use to create applications. SOAP toolkit vendors will find it challenging to add MIME support because MIME relies on helper applications to do its job. When you work with an e-mail or browser application, you can define these helpers. Using MIME seems automatic because the application list is readily available. However, SOAP doesn’t provide a user interface as such, so where do you configure the helper application list? Using the client’s list of helpers will work on the client machine, but not on the server.
Another problem that SOAP toolkit vendors will have to solve is the issue of attachment compatibility. For example, it’s relatively easy to say that users will require some type of image, sound, and multimedia support, but vendors will have to figure out which file types SOAP should support. The image files on a Macintosh are different from those on a PC (and they’re stored differently as well).
MIME support will also require extensions to the XML parser on most machines. The vendors will have to teach the parser the difference between the SOAP and the attachment portion of the message. The parser might also face difficulties in parsing SOAP messages that refer to attachments so the user can interact with them. The technology is already available in part, but the implementation lags behind progress in other areas of SOAP.
The second level of change is developer use of SOAP toolkits. It’s unlikely that any MIME implementation will rely on the high-level API that most developers prefer. This means that developers will spend more time building messages by hand and will lose some of the advantages that automation provides. It’s important to plan for delivery delays due to the increased time required to use the low-level API.
Developers will also spend more time considering cross-platform compatibility issues. Even if the SOAP toolkit vendor provides a great parser and clearly defined policies for attachment use, it’s up to the developer to resolve the differences between the SOAP toolkit capability and real-world requirements. For example, the developer will need to provide a means for placing an image found in an attachment within a SQL database.
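The following Python sketch gives a feel for the hand-built messages this section describes. It uses the standard email package to assemble a multipart/related package with a SOAP part and one binary attachment, and then separates the two again on the receiving side. The sketch assumes the attachment is referenced by a cid: link to its Content-ID header. The AddPhoto and image element names are hypothetical, and no SOAP toolkit is involved.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical envelope; the href points at the attachment's Content-ID.
ENVELOPE = (
    '<SOAP-ENV:Envelope '
    'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    '<SOAP-ENV:Body><AddPhoto><image href="cid:photo1"/></AddPhoto>'
    '</SOAP-ENV:Body></SOAP-ENV:Envelope>'
)

def build_message(image_bytes):
    """Build a multipart/related package: SOAP part first, then the attachment."""
    package = MIMEMultipart("related", type="text/xml")
    package.attach(MIMEText(ENVELOPE, "xml", "utf-8"))
    attachment = MIMEApplication(image_bytes)  # sent as application/octet-stream
    attachment.add_header("Content-ID", "<photo1>")
    package.attach(attachment)
    return package

def split_message(raw):
    """Separate the SOAP part from the attachments, keyed by Content-ID."""
    parts = message_from_bytes(raw).get_payload()
    soap = parts[0].get_payload(decode=True).decode("utf-8")
    attachments = {part["Content-ID"]: part.get_payload(decode=True)
                   for part in parts[1:]}
    return soap, attachments

raw = build_message(b"\x89PNG stand-in bytes").as_bytes()
soap_text, attachments = split_message(raw)
print(sorted(attachments))
```

Notice that only the first part goes to the XML parser. Deciding what to do with each attachment, such as placing an image in a SQL database, remains the developer’s job.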
Where MIME Support Is Going
You can’t use MIME today, and given the number of issues vendors must resolve before developers can start using MIME, you probably won’t see MIME support for a while. Some vendors see MIME as an unnecessary add-on, while others want to work on other SOAP requirements first. Eventually, you’ll be able to use MIME to transfer some types of data using SOAP. The technology already appears in other parts of the Internet and it offers too many advantages to ignore.
The W3C has created a specification for working with SOAP attachments (http://www.w3.org/TR/SOAP-attachments). This information is in addition to the material that we discussed in the “Understanding SOAP Attachments” section of Chapter 2. Despite the number of documents that discuss SOAP attachments, no one has implemented them fully yet. Several vendors are working on including SOAP attachments at the time of writing, so you might find a toolkit that includes the required support.
The short answer is that MIME will have to wait until other parts of SOAP are better defined, vendors have finally solved some of those compatibility problems, and pressing issues such as security are resolved. SOAP is a new technology and it will require time to define fully.
MIME will also have to evolve. The current implementation of MIME defines multi-part message delivery for applications such as e-mail and browsers. Vendors rely on the user to fill in many of the gaps required to make MIME work. They won’t have that luxury when working with SOAP. It may be that we’ll eventually end up with something like MIME, only better defined and explicit for use with SOAP. Only time will tell.
Project
The best way to learn how to create surveys and data entry forms is to create a few projects of your own. Creating a survey is much easier than a data entry form when working with SOAP because you only need to consider one-way data transfers. Surveys come in a wide range of forms, providing you with a chance to experiment at several levels. Use the following steps to create several survey SOAP applications of your own.
1. Design a database. Make sure you consider privacy and security issues as part of the design. In many cases, making the survey anonymous works just fine. Try using an e-mail address in situations where you need to identify the user in the least invasive way possible.
2. Create a server-side survey component. Make sure this component provides only one-way access. It should only allow the user to pass survey information to the server; the server shouldn’t send any information back. Adding a success message to the client application tends to reduce the need for server communication. It’s also important to create a special account for survey access. Use a different account for each survey to ensure maximum protection.
3. Create the survey form. It often helps to build a local client first, then use the code you create as the basis for a remote client. Ensure you provide full error message support because the user will likely have little computer experience.
4. Add a minimum of nine records to your database using the survey application. This allows you to test the application and build records for analysis. Make sure your database contains enough entry types that you can test application functionality fully.
5. Create a server-side analysis component. Make sure this component provides read-only access to the database. This ensures that data integrity remains high and reduces the chance that the client will accidentally corrupt the database. Create a special account with read-only access for this survey analysis. In some cases, you might have to provide supervisor access using a separate component and account for managers at the client. Adding filtering and indexing capabilities for the analysis component is one way to improve performance because the client will only download needed data.
6. Design a survey analysis form. In fact, you might need to design several forms to meet client needs. In addition to raw data, provide outputs showing interpreted data. For example, you might want to provide a percentage for each raw data output. Many analysis programs also provide graphics, such as pie charts, to ensure the client receives maximum benefit from the data.
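If you want a starting point for the project, the following Python sketch models the two server-side pieces with an in-memory SQLite database. The table layout, function names, and ratings are hypothetical. The sketch only illustrates the one-way write and the read-only analysis. It doesn’t create the separate accounts that the steps call for, and it leaves out the SOAP layer entirely.

```python
import sqlite3

def open_database():
    """Create a small survey table; the e-mail address is optional."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE survey (email TEXT, rating TEXT NOT NULL)")
    return db

def submit_survey(db, rating, email=None):
    """One-way write: store the response and return nothing to the caller."""
    with db:
        db.execute("INSERT INTO survey (email, rating) VALUES (?, ?)",
                   (email, rating))

def rating_percentages(db):
    """Read-only analysis: percentage of responses for each rating."""
    total = db.execute("SELECT COUNT(*) FROM survey").fetchone()[0]
    if total == 0:
        return {}
    rows = db.execute("SELECT rating, COUNT(*) FROM survey GROUP BY rating")
    return {rating: 100.0 * count / total for rating, count in rows}

db = open_database()
# Nine test records, matching the minimum suggested in step 4.
for rating in ["good"] * 6 + ["fair"] * 2 + ["poor"]:
    submit_survey(db, rating)
percentages = rating_percentages(db)
print(percentages)
```

Keeping the write path and the analysis path in separate functions mirrors the separate components in steps 2 and 5, which makes it easier to give each one its own account later.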
CHAPTER 8
Providing Remote Database Access
In this chapter
Remote Database Application Uses and Concerns
Developer Shortcuts for Remote Database Applications
Using Complex Data Types
Defining the SQL Server Database
Creating the Server-Side Component
Creating a Middle-Tier Component
Creating the Client-Side Application
Testing the Application
Quick Fixes for Remote Database Applications
Addressing Transaction Issues
Troubleshooting
Databases form the core of most, if not all, businesses. Everyone needs to store data for future use. The data a business owns defines it and enables it to compete. Businesses also sell data and share it with partners. Customers rely on the data a business owns in the form of catalogs to make purchases. In short, it’s natural that you’d want to provide some form of remote database access using SOAP. Using SOAP will allow a business to exchange data with more businesses than ever before in more ways than anyone would have thought possible even a few years ago.
This chapter is going to help you understand the SOAP-to-database connection. You’ll find that SOAP brings with it special requirements for database access. Directly accessing data in a database with SOAP is unwise, and we’ll explore the reasons why in the first section of this chapter, “Remote Database Application Uses and Concerns.” This section will also explore some of the better uses for databases with SOAP and some of the uses you should avoid.
The second section of this chapter, “Developer Shortcuts for Remote Database Applications,” will help you get your applications finished faster. We’ll look at techniques you can use to avoid development pitfalls when developing SOAP applications. For example, SOAP normally requires additional “buffer” components that you might not use in most situations.
The third section of this chapter, “Using Complex Data Types,” contains a special example showing how to use complex data types. The Microsoft SOAP Toolkit doesn’t provide the best support for this type of example, so we’ll use pocketSOAP instead. This example does point out the need for innovation when working with SOAP in its current state because not every toolkit supports every requirement. You’ll find that compatibility requirements become even more critical as the complexity of the application increases.
The fourth (“Defining the SQL Server Database”), fifth (“Creating the Server-Side Component”), sixth (“Creating a Middle-Tier Component”), and seventh (“Creating the Client-Side Application”) sections show you how to create a SOAP database application. You’ll immediately see that some development tasks are the same as any other environment. For example, the construction of a database doesn’t change simply because you’re using SOAP. In other cases, such as the middle-tier component, you’ll see how to work within the limits of SOAP. After you have the application put together, it’s time to test it. The eighth section of this chapter, “Testing the Application,” will tell you how to test SOAP applications effectively. It’s important to remember that you have less control over a SOAP application in some ways because the client and server aren’t directly connected. You also need to test more failure conditions, such as the loss of a connection. The ninth section of this chapter, “Quick Fixes for Remote Database Applications,” tells you how to repair your database application when it fails. Many developers are under the impression that software doesn’t fail. In the strictest sense of the word, they’re correct because software doesn’t wear out. However, applications can fail in many other ways, and this section tells you about the problems that you’re most likely to see. We’ll look at database applications in general and SOAP applications in particular.
I mentioned earlier that data defines your business. This means that data loss can cripple your business in ways that you might not know about until the loss occurs. One solution to the problem is to ensure that data loss doesn’t occur, or at least becomes a rare event. Most database applications today rely on transactions to ensure that data transfers occur safely. Unfortunately, SOAP doesn’t provide a native means for handling transactions. We’ll talk about this issue in the “Addressing Transaction Issues” section of this chapter. The final section of this chapter, “Troubleshooting,” tells you how to troubleshoot SOAP database applications based on everything you’ve learned in the chapter. We’ll go through some problem scenarios that you’re likely to see so that you learn to recognize the source of failures immediately.
Remote Database Application Uses and Concerns
Database management forms the core of most businesses. Each business-specific activity revolves around some form of data storage and retrieval. Some developers work on database management systems exclusively because the demand for these applications is so great. Databases come in every size and complexity. It’s little wonder then that the ability of SOAP to handle database management tasks is high on the list of developer concerns.
Database applications produce copious amounts of data. SOAP (actually XML in general) increases the size of that data by wrapping it in tags. It doesn’t take much to figure out that a SOAP application could overwhelm your server’s storage capacity in short order unless you’re very careful about the methods used to store the information. One way to reduce the effect of this problem is to compress the data. Unfortunately, standard compression techniques leave the document object model (DOM) unreadable, which makes the data inaccessible. In short, standard compression reduces the functionality of SOAP. Fortunately, there’s a solution to the problem in the form of XMLZip (http://www.xmls.com/resources/xmlzip.xml?id=resources_xmlzip). This product compresses your SOAP data and still allows applications to access it as normal. You can get XMLZip for the price of a download, and it’s available for the Windows NT/2000, Solaris, and Linux platforms. We’ll discuss this product more in Appendix C, “Third-Party Tool Reference.”
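You can measure the tagging overhead for yourself with a few lines of Python. This sketch wraps 100 made-up customer records in XML and compares the size of the raw values with the size of the XML. It then applies standard zlib compression. This is ordinary compression, not XMLZip, so the sketch also shows the drawback noted above: the compressed bytes mean nothing to an XML parser until you decompress all of them.

```python
import zlib
import xml.etree.ElementTree as ET

# Made-up sample data for illustration.
records = [{"id": str(i), "name": f"Customer {i}", "city": "Indianapolis"}
           for i in range(100)]

def to_xml(rows):
    """Wrap each field of each record in its own element."""
    root = ET.Element("records")
    for row in rows:
        item = ET.SubElement(root, "record")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root)

raw_size = sum(len(value) for row in records for value in row.values())
xml_bytes = to_xml(records)
packed = zlib.compress(xml_bytes)

print("raw values:", raw_size, "bytes")
print("as XML:", len(xml_bytes), "bytes")
print("compressed XML:", len(packed), "bytes")

# The packed bytes are opaque to a parser until they are fully decompressed.
restored = ET.fromstring(zlib.decompress(packed))
```

Repetitive tags compress well, which is why compression is attractive here. The cost is that an application can no longer reach a single record without unpacking the whole document first.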
This section of the chapter isn’t going to tell you that SOAP is the greatest protocol in the world for handling database application needs. In fact, SOAP is only useful when creating certain classes of database applications. The first section that follows will tell you why SOAP is less than perfect for database application needs. The second section will tell you where SOAP excels, and which applications can use SOAP successfully. The bottom line is that there are some situations where online processing of your data won’t work because the technology required to implement a solution doesn’t exist. SOAP is simply a first step into the larger application development world.
Concerns
Let me say up front that the use of SOAP for any kind of database application concerns many developers. The problem, according to purists, is that SOAP brings with it too many limitations to provide effective database support. About half of the people I talked with agreed that five issues hinder the use of SOAP for database applications.
■ Security: SOAP doesn’t provide native security. Adding security separately isn’t good enough, according to some experts, because the use of two protocol levels leaves the data exposed for some period of time (albeit a small one). In most cases, the security provided by protocols such as Secure Sockets Layer (SSL) will work fine for non-critical, but sensitive, data transfers.
■ Transactions: You won’t find transactional support in SOAP, and many developers feel that it’s going to be impossible to add it. (An effort is being made now to add transactions, but as of this writing, the committee studying the problems hasn’t made much progress.) This is a significant problem, and one that should probably keep you from using SOAP for transferring critical data. However, not all data transmissions on a LAN require transactions, and it’s certain that not all data transmissions on the Internet will require them either. The point is to choose the kinds of SOAP applications you create carefully.
■ Connectivity: Unlike your LAN, a SOAP connection can become broken at any time. Because SOAP uses Hypertext Transfer Protocol (HTTP) or Simple Mail Transfer Protocol (SMTP) as a transport protocol, you’ll find that the server doesn’t notice the connection break immediately and that data loss can occur. SOAP doesn’t provide the robust error handling of protocols such as Distributed Component Object Model (DCOM) and Common Object Request Broker Architecture (CORBA). The use of a buffer component such as the one shown in this chapter can help overcome this particular problem.
■ Data Type Translation: As of this writing, the SOAP specification contains little in the way of support for complex data types. The specification also lacks information on translating complex types to the simple types that the specification does support. Tool vendors are currently trying to address this issue, but it seems that each tool vendor has a different vision for data translation. Ensure that the SOAP toolkits you use for your applications support the same data types and data translation schemes, and you can probably avoid this problem.
■ Scalability: Database applications need to support thousands of users in many cases. SOAP detractors assume that all of these users will rely on a SOAP connection to work with data, and that the limitations of this protocol will keep the application from scaling well. The verbose nature of the data exchange is also a source of concern because many servers have a hard time keeping up with binary protocols they don’t need to translate. The reality is that scalability could become a problem if you attempt to use SOAP in more ways than the protocol will actually support. The best approach is to use SOAP for remote communication as needed and to continue to rely on binary protocols for local and secure remote communication.
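The buffer idea mentioned under Connectivity can be sketched before we reach the full component later in the chapter. This Python example is not that component. It’s a simplified illustration, built on an in-memory SQLite database with hypothetical table names, of the two steps involved: store the incoming message first, and then apply it to the real table inside a local database transaction.

```python
import sqlite3

def open_store():
    """Create a hypothetical inbox (the buffer) and the real orders table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE inbox (id INTEGER PRIMARY KEY, payload TEXT, "
               "done INTEGER DEFAULT 0)")
    db.execute("CREATE TABLE orders (item TEXT NOT NULL)")
    return db

def receive(db, payload):
    """Buffer step: persist the incoming message before doing anything else."""
    with db:
        db.execute("INSERT INTO inbox (payload) VALUES (?)", (payload,))

def process_inbox(db):
    """Apply each buffered message to the real table in a local transaction."""
    pending = db.execute(
        "SELECT id, payload FROM inbox WHERE done = 0").fetchall()
    applied = 0
    for row_id, payload in pending:
        try:
            with db:  # commits on success, rolls back on error
                db.execute("INSERT INTO orders (item) VALUES (?)", (payload,))
                db.execute("UPDATE inbox SET done = 1 WHERE id = ?", (row_id,))
            applied += 1
        except sqlite3.Error:
            pass  # leave the message in the inbox for a later retry
    return applied

store = open_store()
receive(store, "widget")
receive(store, None)  # a damaged message that the orders table will reject
print(process_inbox(store), "message applied")
```

A message that fails stays in the inbox, so a broken connection or a bad record costs you a retry instead of lost data. The remote client never touches the orders table directly.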
These five issues would keep some developers from using SOAP for database applications at all. You have to wonder why they’d even invest the time learning SOAP because it’s no longer available for use with one of the major programming tasks that every business performs. However, despite the limitations you’ll run into, SOAP is still useful for database application programming.
Don’t get me wrong. These issues are real and you need to consider them. For example, I’d never recommend sending mission-critical data across the Internet using SOAP because the data might get lost in transit. Future specifications might eliminate this problem by using transactions, but for now, you need to consider the problem as part of your development efforts.
There are, however, thousands of non-mission-critical database uses for SOAP. Consider the lowly order entry system. Certainly, it’s no secret that your company accepts orders for merchandise displayed in a catalog available to anyone. There isn’t any secret data to lose. Crackers won’t compromise the order data if they do manage to get a peek. You can overcome some of the obstacles of potential cracker activity by using an e-mail response system that verifies order data with the client. Such systems are already in place and work well.
For the purposes of this book, the term cracker will always refer to an individual who breaks into a system on an unauthorized basis. This includes any form of illegal activity on the system. On the other hand, a hacker will refer to someone who performs low-level system activities, including testing system security. In some cases, you need to employ the services of a good hacker to test the security measures you have in place, or suffer the consequences of a break-in. This book will use the term hacker to refer to someone who performs these legal forms of service.
Of course, you do have to consider the sensitive data in an order entry system. You don’t want to send the customer’s credit card number across the wire without protection or keep it on your system for long. A customer might want to send credit card information using another secure technique and simply reference that credit card using a code word on the order entry page. We’ll investigate the issue of Web-based application security in the “Security Issues for Web-Based Applications” section of Chapter 9, “Moving to Web-Based Applications.”
I realize that online purchasing systems already send credit card information across the Internet using SSL. These companies store the data on a local server and allow you to reference it from your machine. However, recent trade press stories suggest that it isn’t a good idea to send this information across the wire, nor is it practical to store it locally. Crackers have already invaded too many Web sites and stolen credit card information. It’s safer, and more practical, to obtain this information using some secure transfer method and store it only on a secure server behind a firewall for short timeframes. The network administrator should purge old credit card information from inactive clients on a regular basis.
Another important issue that many developers face when using SOAP for database management is recordset size. Although it’s possible to transfer a moderate number of average-sized records, you wouldn’t want to use SOAP to transfer a large number of records in a single call. For example, one developer reported that he was able to transfer 300 records of about 1,200 bytes per record successfully. However, when he tried to move 1,000 records, SOAP reported a timeout error. In addition, performance suffered when he was able to move a large number of records. This problem doesn’t always occur (many developers write database applications that move significant amounts of data in a single call), but it does happen often enough that you need to consider it as part of your application design.
The main problem with transferring large numbers of records is that SOAP doesn’t provide a direct representation of a record; it sends the data in an array of structures. The timeout occurs when SOAP doesn’t receive all of the data in a predetermined time interval. (Some SOAP toolkits allow you to set the interval using a special argument.) Variables such as line noise make it difficult to predict how long a large download will take.
SOAP toolkits also process arrays differently, which means you’ll see different results by changing your toolkit. For example, some developers reported better results using SOAP::Lite during the time of this writing. (You can see SOAP::Lite messages related to databases at http://groups.yahoo.com/group/soaplite/message/236.) Using SOAP::Lite allowed the developer to transfer 11,000 records in as little as 25 seconds (which is still slower than a LAN).
The toolkit you choose will affect data transfer statistics. No matter which toolkit you use, however, you’ll eventually run into time constraints. Processing records using an Internet connection isn’t the same as using a LAN. The best solution to the problem is to use server-side processing to restrict the number of records you transfer to the client. Sending only the records you need to the client helps overcome some of SOAP’s limitations.
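One common way to restrict the number of records per call is paging, where the client asks for one small block of records at a time. The following Python sketch shows the logic with an ordinary list standing in for the database. The function names are hypothetical. The default page size of 300 simply echoes the transfer that succeeded in the report above, so treat it as a starting point to tune rather than a recommendation.

```python
def fetch_page(records, page, page_size=300):
    """Return one page of records plus a flag that says whether more remain."""
    start = page * page_size
    chunk = records[start:start + page_size]
    return chunk, start + page_size < len(records)

def fetch_all(records, page_size=300):
    """Client loop: many small calls instead of one large transfer."""
    page, collected, more = 0, [], True
    while more:
        chunk, more = fetch_page(records, page, page_size)
        collected.extend(chunk)
        page += 1
    return collected

data = list(range(1000))  # stand-in for 1,000 database records
print(len(fetch_all(data)), "records fetched in pages")
```

In a real application, fetch_page would run on the server and each call would be a separate SOAP request. A lost connection then costs you one page, not the entire recordset.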
Common Uses Based on SOAP Limitations SOAP isn’t the end-all solution for database management. It has limitations that prevent it from becoming the protocol of choice for heavy-duty applications. However, many classes of light-duty applications work just fine with SOAP. The following list highlights some of the important application types where SOAP works well despite limitations. ■
Surveys: Many companies rely on one-way data transfer to collect information from customers. You can use SOAP to transfer data efficiently from a client site to the server and then call upon a server-side component to process the data. Some detractors state that HTML forms have been around for a long time and seem to fulfill every need. However, companies are constantly creating new survey types, and SOAP represents one way to add intelligence to the forms.
■
Forms: Companies run on forms. The fact that companies don’t make them out of paper any more doesn’t change much. If an employee requires more supplies, he will need to request them using some type of form. Processing these forms using SOAP will reduce manager workload. In some cases, small purchases don’t require manager approval, and the automation that SOAP provides allows the server-side application to determine this automatically. A form application could also return a receipt to the user telling him that the product is ordered. Another receipt might tell when the order actually ships from the warehouse.
■
Query: Queries represent one of the least potentially damaging forms of two-way communication between client and server. A user makes a request; the server provides the information in some form. SOAP provides the glue that allows the two to communicate. I’ve shown many examples of the query application in this book because users need this kind of application most often. Even if your company doesn’t want to risk data contamination by allowing users to enter data into the database, allowing the users to query the database no matter where they are will still provide benefits in employee performance.
■
Order Entry: If you have a sales force on the road, you know the convenience of remote order entry. Customers are no longer willing to wait while the salesperson makes his rounds to get an order started—they want their order entered today. However, smart order-entry processing was problematic in the past because it normally relied on technologies such as DCOM. SOAP allows you to create order-entry systems that are more reliable (at least when compared to DCOM on the Internet). Of course, you’ll want to create safety procedures that protect user data, such as credit card numbers.
■
Reports: Besides forms, employees normally have a wealth of reports to fill out, everything from their current status to the number of hours they worked in a week. In the past, reports often waited until the employee came back from a trip, creating the usual pile of papers that everyone hates to see. Using SOAP and SMTP together allows an employee to transfer reports today that the server processes as it has time. As with many applications that we’ve discussed in this list, you’ll need to buffer the reports so the application doesn’t have remote access to the database. Unlike many applications, the use of SMTP acts as a buffer, so you don’t need to build this support into your application.
Developer Shortcuts for Remote Database Applications
Converting your database application to use SOAP isn’t the same as creating a new application. The amount of work that you put into the conversion depends on the complexity, modularity, and overall workmanship of your application. A modular application usually divides the middle-tier processing from the back end anyway, so everything is already in place for the conversion. Sometimes, you can update the existing components and, in other cases, you’ll create a small communication component to interact with SOAP.
Many developers find themselves overwhelmed with the task of updating an application to use SOAP instead of another technology, such as CORBA. The thought is that every small change at the front end of the project will create major changes in the back end or that the project will run into compatibility nightmares from the beginning. Database applications are especially daunting for all of the reasons we’ve discussed so far.
You can reduce the time required to update an application to SOAP or even create a new one. The following tips provide a few ideas on how you can make your next project a little easier.
Chapter 8
Providing Remote Database Access
■
Test a simple case first: A friend recently reminded me about the test harnesses that many manufacturers use to check product components. For example, a washing machine vendor might create a test harness to verify that the motor for that machine works as anticipated long before the first washing machine appears on the scene. You can use this same principle for your SOAP applications. Make sure you test a simple case of the communication components before you invest time in creating the final user interface or complex back-end components. Data exchange is a major issue when your communication protocol relies on a text-based format.
■
Validate every database data type: Many database managers provide special data types to allow efficient storage of bulky or unusual data. For example, most database managers include currency and date types in native form. A database manager might include a picture or other unusual data type. Of course, SOAP doesn’t provide the functionality required to transfer a picture data type, so you need to translate that data in some way. The goal of validation is to ensure you can either transfer every database data type natively or use a stored procedure to translate it before placing it on the wire. Creating generic routines now will ensure you don’t have to stop in mid-project to create the routines later.
■
Verify the toolkit’s capacity: Database applications tend to stress the capabilities of most SOAP toolkits. You need to verify that the toolkit provides all of the functionality your database application will require. For example, if you plan to transfer large amounts of data with each transaction, you’ll need a toolkit that allows you to set the timeout value. If your database includes a lot of unusual data types that your application will need to use, then you’ll want a toolkit that supports user-defined types (UDTs). See the “Using Complex Data Types” section of the chapter for details.
■
Choose a parsing technique early: The parsing technique you use depends on the kind of data you need to transfer and the required transfer speed. Most XML parsers include the ability to parse messages that rely on the Document Object Model (DOM) or Simple API for XML (SAX). DOM is the World Wide Web Consortium (W3C) recognized method for transferring data using XML and represents the best method to use if compatibility is the major concern. DOM is also the best solution if you need to read and write documents. However, many developers find DOM so complex to use that they created SAX. SAX is more efficient than DOM and will allow you to create faster applications. Unfortunately, SAX is read-only, so it limits the number of tasks you can perform. A third alternative, JDOM, is supposed to provide the speed of SAX and the flexibility of DOM, but it doesn’t have much of a market presence yet. Many SOAP toolkits bury the method for using the parsing technology, so you’ll need to spend time looking at how the documents call on the remote server.
The W3C finally recognized DOM Level 2 in November 2000. This is the standard that adds support for Cascading Style Sheets (CSS), namespace support, a standard method for combining objects, and a standard means to access and manipulate objects. In short, the very basis of everything that SOAP represents is still in a state of flux. The W3C is still considering other DOM changes that could ultimately affect how you work with SOAP. For example, you can expect DOM to support Scalable Vector Graphics (SVG) and Mathematical Markup Language (MathML) in the future. The vendors involved with DOM are currently working on the Level 3 specification.
■
Address client interoperability concerns: HTML is successful because you can use any browser to view it—at least if the HTML coding follows the established standards. No one has standardized SOAP yet, so there are interoperability problems. However, the SOAP specification assures a certain level of compatibility between platforms. You need to ensure that the toolkit you select will support all of your platforms and those of your clients.
■
Use standard database application development techniques: This book won’t tell you how to create a database application. However, some developers change their tactics every time they see a new technology because they think the old techniques will no longer work. For the most part, developing a SOAP database application is no different than developing one for your LAN. The only difference is the techniques you use to transfer data from one place to another. Instead of instantiating components using DCOM, you’ll use SOAP techniques.
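The DOM-versus-SAX trade-off from the parsing tip in this list is easier to judge with both techniques applied to the same message. The following sketch uses Python’s standard library parsers as stand-ins (the SOAP toolkits in this chapter use their own parsers), and the Tasks message and the NameCollector class are hypothetical. It shows SAX reading values as a stream of events, and DOM loading the whole document so that you can also change it.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical message; not taken from any toolkit in this chapter.
MESSAGE = b"""<Tasks>
  <Task id="1"><Name>Call client</Name></Task>
  <Task id="2"><Name>File report</Name></Task>
</Tasks>"""


class NameCollector(xml.sax.ContentHandler):
    """SAX handler: receives events in document order and keeps only task names."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect until the end tag.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Name":
            self.names.append("".join(self._buffer))
            self._in_name = False


# SAX: read-only and streaming; nothing is stored unless the handler stores it.
collector = NameCollector()
xml.sax.parseString(MESSAGE, collector)
sax_names = collector.names

# DOM: the whole tree is in memory, so it can be read and modified.
document = minidom.parseString(MESSAGE)
dom_names = [node.firstChild.data for node in document.getElementsByTagName("Name")]
document.getElementsByTagName("Task")[0].setAttribute("completed", "true")
modified = document.documentElement.toxml()
```

Both approaches recover the same names; only the DOM version can write the completed attribute back into the document.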
Using Complex Data Types
Databases use a large number of data types. In fact, you might find that SOAP doesn’t support all of the data types for your DBMS. This means that you’ll face data conversion issues that could affect the performance of your application to the point that handling thousands of transactions per day becomes difficult. The following sections address the issue of using complex data types within your application. We’ll begin with an overview of the problems you’ll face when working with complex data types. The next section shows how various WSDL generators handle those problems. The last two sections show you a complex data type example.
Understanding the Problem
You need to consider two issues for data type conversions. The first is whether the unsupported type is a complex type created from several simple types or whether it’s a new type. You can create complex types in SOAP using a special coding technique. See http://www.w3.org/TR/2001/PR-xmlschema-1-20010330/#element-complexType for details on the XML definition for a custom data type. New types require a custom schema definition.
Remember that the basic difference between a simple type and a complex type is that a complex type can carry elements in its content and can include attributes. You can define new simple or complex types, so it’s important to use the proper type for the job at hand.
The issue is one of compatibility. SOAP supports complex types, even if you have to write the code for them manually. (Few of the code generators that I’ve seen provide native support for complex types.) SOAP doesn’t support custom schemas. You can add the schema to your message, but there’s no guarantee that the remote system will understand the new data type. In short, custom data types allow you to convert data types to a form that a remote system will understand, but new types are usually incompatible with other platforms.
Of course, you can always reduce the amount of data conversion you have to perform by simulating the new data type using a complex data type. Consider the case where you have to support a complex string. For example, SOAP doesn’t provide direct support for Pascal-type strings. You could simulate this kind of string by defining a complex type consisting of a number and a string. Here’s the Visual Basic version of such a definition.
Type ComplexStringType
    StringLength As Integer
    StringData As String
End Type
StringLength would hold the length of the string and make conversion easier. StringData contains the string. Here’s what the XML would look like for such a type if you wanted to use a formal definition:
However, you don’t always need to provide a formal definition of types in your application. In fact, most developers will use the shortest form possible for convenience. Here is a perfectly acceptable alternative for most uses:
Let’s talk about the differences in the two definitions. You’ll remember from a previous discussion (see the “Attributes Versus Elements” section in Chapter 4, “Using SOAP to Create a Simple Application”) that you’ll normally want to use elements instead of attributes. Using the name attribute for the <complexType> tag is acceptable, so we can get rid of the first <element> tag. One of the two default content types for XML is <complexContent>, so we can eliminate that tag as well. Finally, the <restriction> tag serves to restrict the acceptable types for a complex type. Since we’ve used “anyType” in this case, the restriction doesn’t exist and the tag is redundant. In fact, developers normally forgo using the <restriction> tag unless they derive a new type from a base type. You can see the simplified form of complex type definition in the specification (see the first example at http://www.w3.org/TR/2001/PR-xmlschema-1-20010330/#Complex_Type_Definitions). However, it’s important to remember that both formal and simplified definitions exist to allow better definition of complex data types.
Of course, this is just one of many complex types that you might need to create for a database application. You’ll find another write-up about complex data types in XML at http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/#element-complexType (or http://www.w3.org/TR/xmlschema-0/#element-complexType).
The example at this second location shows how to use the simpleContent tag in place of the complexContent tag shown in the example. You can use the simpleContent tag when you’re converting a simple type without elements. In other words, the simpleContent type represents a method for doing something like adding attributes to a simple type. You must use the complexContent tag when converting types such as structures.
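To make the formal-versus-simplified discussion concrete, the following Python sketch holds one plausible pair of XML Schema definitions for ComplexStringType and checks that both declare the same two members. Treat both definitions as assumptions: they are a reconstruction based on the XML Schema specification, not the listings from this chapter, and the choice of xsd:short for StringLength simply mirrors the 16-bit Visual Basic Integer. The members() helper is hypothetical as well.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Assumed "formal" form: explicit complexContent restricting anyType.
FORMAL = f"""
<xsd:element name="ComplexStringType" xmlns:xsd="{XSD}">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:sequence>
          <xsd:element name="StringLength" type="xsd:short"/>
          <xsd:element name="StringData" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>
"""

# Assumed shorthand form: a named complexType with the defaults left out.
SHORT = f"""
<xsd:complexType name="ComplexStringType" xmlns:xsd="{XSD}">
  <xsd:sequence>
    <xsd:element name="StringLength" type="xsd:short"/>
    <xsd:element name="StringData" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>
"""


def members(schema_text):
    """Return the (name, type) pairs declared inside the type's sequence."""
    root = ET.fromstring(schema_text.strip())
    sequence = root.find(f".//{{{XSD}}}sequence")
    return [(child.get("name"), child.get("type")) for child in sequence]


formal_members = members(FORMAL)
short_members = members(SHORT)
print(formal_members == short_members)
```

The comparison only confirms that the two forms list the same members; it doesn’t validate either definition against the schema specification.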
Understanding WSDL Generator Differences
Note that not all WSDL generators provide equal functionality when working with complex types. The Microsoft SOAP Toolkit WSDLGen utility will create a WSDL file with question marks that you need to remove. In addition, the modification of both the WSDL and WSML files can prove troublesome. Using the psWSDL Wizard (described in Appendix C) results in a single WSDL file that contains an entry for the user-defined type. All you need to hand code, in this case, is the type definition, as shown in this section of the chapter. In fact, the psWSDL Wizard even tells you where to place the type definition, as shown in Figure 8.1.
The psWSDL example WSDL file and associated simple component form appear in the \Chapter 08\ComplexString (Simple) directory of the source code available from the Que Web site. You can find the source code at http://www.quepublishing.com. The source code file also contains an IDL version of the component that works with pocketSOAP. The two components perform the same task, but use different techniques to do it.
Figure 8.1 Make sure you use the correct WSDL generator for the job when working with complex data types.
The psWSDL Wizard also fixes another problem with the WSDL file. Visual Basic won’t allow a developer to code user-defined types as ByVal. This means that when you generate the WSDL file using WSDLGen, any input arguments will also appear as outputs. The psWSDL Wizard allows you to select the in/out status of each argument, which means you get a better WSDL file output. However, as we’ll see later in this section, you’ll still need to overcome dispatch problems to make the component work with pocketSOAP.
Of course, creating a WSDL file doesn’t fix a more pressing problem with the Microsoft SOAP Toolkit. This product doesn’t support UDTs using the high-level API that we’ve used for many examples so far. You must write specially defined encoders and decoders to do the job using the low-level API. Using the low-level API incurs a development penalty that some developers won’t want to pay. You might find that converting UDTs to classes is easier than creating a complex encoder and decoder when working with the Microsoft SOAP Toolkit. You can access classes directly using the high-level API, but again, you’re looking at additional coding effort if your application already uses UDTs.
Third-party products do support UDTs. For example, 4S4C (see Appendix D, “SOAP for Visual C++ Developers,” for details) provides full UDT support. Figure 8.2 shows what happens when you configure this product to access a DLL containing UDTs. As you can see, 4S4C automatically generates the correct WSDL file for you. Unfortunately, this is only the server side of the equation. You’d still have to create complex code for the client side using the Microsoft SOAP Toolkit. That’s why this example relies on pocketSOAP for the client.
Figure 8.2 Some third-party products support UDTs natively, but come with other costs.
Describing the Interface
The server-side component is an amalgam of several techniques in this example. Visual Basic alone won’t be able to create the component required for the complex data type because of limitations in the language. In this case, you can’t control the in and out parameters as well as you need to using native Visual Basic code. Visual Basic restricts your use of ByVal when working with UDTs, which means that all parameters are treated as in/out parameters, an unwanted condition when working with UDTs.
Getting around the Visual Basic limitation isn’t hard. All you need to do is create a type library that explains how you want the parameters to work. The first task you need to perform is creating an Interface Definition Language (IDL) file similar to the one shown in Listing 8.1.
Listing 8.1 IDL File That Defines the Visual Basic Component Interface
// ComplexType.idl : IDL source for ComplexType.dll
[
    uuid(5DA0875C-5F25-4B6F-977A-681B6A820B4B),
    version(1.0),
    helpstring("ComplexType 1.0 Type Library")
]
library ComplexType
{
    importlib("stdole2.tlb");

    typedef [uuid(EDD4BB08-52F8-402c-9F19-0AF3312D109A)]
    struct ComplexStringType
    {
        short StringLength;
        BSTR StringData;
    } ComplexStringType;

    [
        object,
        uuid(15E66EB2-693E-43D2-BAE2-F87BF8E0B05E),
        oleautomation,
        helpstring("IComplexString Interface"),
        pointer_default(unique)
    ]
    interface ComplexString : IUnknown
    {
        [helpstring("method GetComplexString")]
        HRESULT GetComplexString(
            [in] ComplexStringType* InString,
            [out, retval] BSTR* Result);
    };
};
The first five lines of the IDL file define the globally unique identifier (GUID), version, and help string for the type library as a whole. The help string is the text you see when you add a reference to this type library to your Visual Basic project. The remainder of the IDL file describes the library.
Using GUIDGen to Create GUIDs
If you haven’t had to create GUIDs before, you might wonder where they come from. You can create GUIDs in a number of ways using Windows API calls, but this isn’t always useful because you need to know the GUID during compile time. Visual Studio includes a utility named GUIDGen in the \Program Files\Microsoft Visual Studio\Common\Tools directory. This tool doesn’t appear in the Visual Studio Tools menu found in the Start menu. Open the GUIDGen utility; you’ll see a dialog box similar to the one shown in Figure 8.3. Notice that you can create a variety of GUID formats. Normally, you’ll only use the registry format because it’s the most useful for developers of all programming languages.
Creating a GUID is easy. Select Registry Format, and then click New GUID. Click Copy to copy the GUID to the Clipboard and paste it into your IDL file. You can use these GUIDs anywhere you need to create a registry entry. New complex data type applications require several GUIDs, three as a minimum for components of the type shown in the example. For the purpose of this application, make sure you use the GUIDs shown in the source code. Otherwise, you’ll break other code within the example and it won’t work as anticipated.
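If GUIDGen isn’t at hand when you start a new component of your own, any tool that produces random GUIDs will do. The following Python sketch is one such stand-in, not part of the Visual Studio toolset: the function names are hypothetical, and it simply formats a random UUID in the registry style (uppercase, wrapped in braces) and then strips the braces for use inside an IDL uuid() attribute. For the chapter’s example itself, keep using the GUIDs from the source code.

```python
import re
import uuid

REGISTRY_PATTERN = re.compile(r"\{[0-9A-F]{8}-(?:[0-9A-F]{4}-){3}[0-9A-F]{12}\}")


def new_registry_guid() -> str:
    """Return a random GUID in registry format: {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}."""
    return "{" + str(uuid.uuid4()).upper() + "}"


def is_registry_guid(text: str) -> bool:
    """Check that the text has the braces-and-hex layout of the registry format."""
    return REGISTRY_PATTERN.fullmatch(text) is not None


def to_idl_uuid(registry_guid: str) -> str:
    """Strip the braces so the value can sit inside an IDL uuid(...) attribute."""
    return registry_guid.strip("{}")


guid = new_registry_guid()
print(guid)
print("uuid(" + to_idl_uuid(guid) + ")")
```

Each run prints a different value, which is the point: every type library, structure, and interface needs its own GUID.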
The library contains three code segments. The first is the description of the ComplexStringType complex data type. As previously mentioned, this is a structure that includes a StringLength and a StringData entry. This means the string isn’t null terminated but relies on a length variable to indicate the end of the string. Note that you need to include a typedef and a GUID for the ComplexStringType structure. The second section is an object interface definition. You’ll require a GUID for each interface in your IDL file. Listing 8.1 shows typical entries for working with Visual Basic, although IDL supports many interface definition attributes. In this case, we’re telling Visual Basic that the type library supports OLE automation and contains embedded pointers.
Figure 8.3 The GUIDGen utility is an important tool for creating IDL files manually.
The third section describes the ComplexString interface. This section normally contains a list of the methods you want to export using the type library and is the main reason for going this route. Note that you have complete control over the attributes for the GetComplexString method. InString is an input-only parameter. Marking Result as both output and retval (return value) means that you can set a variable equal to the output of this method within Visual Basic. It’s also important to note that this third section requires no GUID because it shares the GUID of the object as a whole.
After you create an IDL file, you need to compile it using the MIDL compiler. The source code file available from the Que Web site contains a simple Make.CMD file that contains the instructions necessary to compile the IDL file and register the type library. Here are the contents of the make file.
midl ComplexType.idl /newtlb /win32
regtlib ComplexType.tlb
The second line is important. Notice that you use RegTLib, not RegSvr32, to register a type library. The RegTLib utility won’t appear on your server unless you installed Visual Studio on it, so you need to copy the utility there. You must register both the component and the type library on the server or the example application won’t work.
Creating the Server-Side Component
Now that we have a type library that describes an interface and the complex data type we’ll use for this application, it’s time to create an implementation of that type library. The component doesn’t do anything fancy in this case. All it does is check the length of the string you’ve input against the length parameter and either pad or truncate the string as required. I’m using dollar signs ($) so the padding shows.
Creating the component is relatively easy. All you need is a standard ActiveX DLL project; then add a reference to the ComplexType 1.0 Type Library that we created in the previous section. Listing 8.2 shows the source code for this example.
Listing 8.2 ComplexType Component Source Code
Implements ComplexString

Private Function ComplexString_GetComplexString( _
    InString As ComplexType.ComplexStringType) As String

    Dim ResultString As String

    'Create a string of sufficient length.
    ResultString = InString.StringData
    If Len(ResultString) < InString.StringLength Then
        Do While Len(ResultString) < InString.StringLength
            ResultString = ResultString + "$"
        Loop
    Else
        'Truncate strings that are too long
        If Len(ResultString) > InString.StringLength Then
            ResultString = Left(ResultString, InString.StringLength)
        End If
    End If

    'Return it to the sender.
    ComplexString_GetComplexString = ResultString
End Function
Remember that this component is an implementation of a type library, so the code begins by showing which interface of that type library it implements. Visual Basic will check your code to ensure you implement all methods contained within that interface. The method implementation is the ComplexString_GetComplexString() function, which returns the modified string.
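If you want to check the pad-or-truncate rule without building the ActiveX DLL, the same logic fits in a few lines of Python. This is only a sketch of the rule as Listing 8.2 describes it, not a replacement for the component; the function name fit_to_length and its single-character pad argument are assumptions.

```python
def fit_to_length(data: str, length: int, pad: str = "$") -> str:
    """Pad data with the pad character, or truncate it, so it is exactly length long."""
    if length < 0:
        raise ValueError("length must not be negative")
    if len(pad) != 1:
        raise ValueError("pad must be a single character")
    if len(data) < length:
        # Too short: append padding so the extra characters are visible.
        return data + pad * (length - len(data))
    # Long enough: keep only the first length characters (a no-op on an exact fit).
    return data[:length]


padded = fit_to_length("Hello", 8)
truncated = fit_to_length("Hello World", 5)
print(padded, truncated)
```

A string that already matches the requested length comes back unchanged, which mirrors the Visual Basic version falling through both tests.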
Creating the Remote Client
The client application relies on a serialized interface with the server. This is similar to the low-level interface employed by the Microsoft SOAP Toolkit. You’ll find that using the low-level interface requires a little more code, but does provide the flexibility needed to use UDTs in an application. Listing 8.3 shows the source code for the remote client.
Listing 8.3 Remote Client Source Code
Private Sub cmdTest_Click()
    Dim SOAPEnv As Object
    Dim Transport As PocketSOAP.HTTPTransport
    Dim Param As Variant
    Dim SOAPParam As Variant

    'Define the SOAP envelope.
    Set SOAPEnv = CreateObject("pocketSOAP.Envelope")
    SOAPEnv.MethodName = "GetComplexString"
    SOAPEnv.URI = "http://winserver/soapexamples/ComplexType/"

    'Create a parameter to place within the envelope.
    Set Param = SOAPEnv.CreateParameter("InString", "", "SOAPStruct")

    'Initialize the parameter.
    Set SOAPParam = CreateObject("pocketSOAP.Param")
    SOAPParam.Init "StringLength", txtLength.Text, ""
    Param.Parameters.Append SOAPParam
    Set SOAPParam = CreateObject("pocketSOAP.Param")
    SOAPParam.Init "StringData", txtString.Text, ""
    Param.Parameters.Append SOAPParam

    'Send the request and receive the data.
    Set Transport = CreateObject("pocketSOAP.HTTPTransport")
    Transport.Send "http://winserver/soapexamples/ComplexType/soap.asp", _
        SOAPEnv.Serialize
    SOAPEnv.parse Transport.Receive

    'Display the result.
    lblResult.Caption = SOAPEnv.Parameters.Item(0).Value
End Sub
The code initially creates four variables. Notice how I’ve defined these variables. If you view the variables in a debugger, it would seem that you could provide precise declarations, but after much experimentation, I found these declarations work best. Almost every application you create with pocketSOAP will require these four variables at a minimum.
The first step in creating the SOAP message is to create the envelope. Notice that pocketSOAP requires two of the same parameters as the Microsoft toolkit: a method name and a server location. You don’t need the name of a WSDL file because we’re not using one for this example. Remember that this is a low-level interface example.
Creating one or more parameters comes next. This is one instance where using a complex data type varies from simple data types. Normally you’ll create a variable that contains the data you need. In this case, the parameter acts as a container for a structure containing two variables. The next step of the process adds the two variables to the parameter. Note that this is a three-step process that includes initializing a variable, adding the data to the appropriate structure member, and appending it to the envelope parameter.
After you create an envelope containing one or more method calls with the appropriate parameters, you can serialize the data. The Transport variable sends the data to SOAP.ASP on the server. SOAP.ASP looks the required component up in CONFIG.XML, instantiates the object, makes the required call, and sends the data back to the client. Notice that the Send() method requires both a server-side location and the SOAP envelope we created in the previous steps.
The Transport object receives data back from the server. However, the data is still in SOAP message format, so you need to use the Parse method of the SOAP Envelope object to parse it. Parsing the data separates the response from the rest of the information so that you can display it. That’s the last step the code performs.
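It can help to see roughly what a serialized request of this kind looks like on the wire. The following Python sketch builds a generic SOAP 1.1 envelope with a struct parameter named InString that holds StringLength and StringData members. It’s an approximation under stated assumptions: the exact XML that pocketSOAP emits isn’t shown in this chapter, so the element layout and the build_request name are illustrative only. The method namespace reuses the URI from Listing 8.3.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
METHOD_NS = "http://winserver/soapexamples/ComplexType/"  # URI from Listing 8.3

# Use the conventional prefix instead of an auto-generated one.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_request(length: int, data: str) -> bytes:
    """Serialize a GetComplexString call whose one parameter is a two-member struct."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    method = ET.SubElement(body, f"{{{METHOD_NS}}}GetComplexString")

    # The struct parameter is a container element with one child per member.
    in_string = ET.SubElement(method, "InString")
    ET.SubElement(in_string, "StringLength").text = str(length)
    ET.SubElement(in_string, "StringData").text = data
    return ET.tostring(envelope, encoding="utf-8")


request = build_request(8, "Hello")
print(request.decode("utf-8"))
```

The nesting mirrors the three-step process in the listing: create the container parameter, initialize each member, and append the members to the container.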
Testing the Complete Data Type Application
Unlike many of the applications in this book, this application relies on pocketSOAP (http://www.pocketsoap.com/pocketsoap/) as a client because the Microsoft client doesn’t provide UDT support. You need to download pocketSOAP and install it according to vendor instructions. Likewise, the server relies on 4S4C (http://www.pocketsoap.com/4S4C/). Appendix D provides a detailed discussion of this product. Make sure you install 4S4C on the server using vendor instructions before you attempt to install the test component.
After you have the SOAP toolkits installed, copy the test component and type library to the server. Make sure you register both the component (using RegSvr32) and the type library (using RegTLib). I created a virtual directory for my Web server that contains SOAP.ASP, SOAP_INC.ASP, SOAP-Error.HTM, CONFIG.XML, ComplexType.TLB, and ComplexType.DLL. You must modify CONFIG.XML as shown here to allow 4S4C to find the component. (Note that I split the second line of code over several lines to accommodate the book width. You normally have to maintain a single long line between the two tags.)
Note that the file contains the name of the SOAP service, the mapped (friendly) name of the component, the location of the component, the component’s program identifier, and the interface GUID. The last two entries appear in the registry. You can also find them using OLE View as we have done in other chapters. Make sure you list the interface GUID, not the GUID for the component or the data structure.
Even if you get everything right, making this example work the first time can be tricky. You know that a server configuration error exists if you have all of the required files in the virtual directory and still receive an error message that the server can’t create the object. A number of problems affect this example, such as the lack of rights to execute applications. After you do configure everything correctly (including the Web server and server security), you can run the application and see the output shown in Figure 8.4. Working with UDTs is doable, but tricky. I found that after the application worked the first time, it worked reliably thereafter, making this the kind of application that only presents problems when you first install it.
Figure 8.4 The output from the ComplexType example shows that you can use UDTs with SOAP.
Defining the SQL Server Database
The database for this example is relatively simple. It contains a list of tasks the user is supposed to complete while on the road. Having a centralized task list allows an employer to add new tasks to the list while the user is on the road. The user can update his list each time he calls into the company. Figure 8.5 shows the construction of this database.
Figure 8.5 A diagram of the simple database used in this example.
This example relies on Microsoft SQL Server 7.0. The scripts found in the \Chapter 08\Data directory of the source code available from the Que Web site allow you to recreate the database within SQL Server. You can find the source code at http://www.quepublishing.com. The directory also contains a backup of the data so that you don’t have to enter it manually.
As you can see, this database is relatively simple. It does include two datetime data fields so that you can see the effects of using something other than text on your application. Because datetime is a data type that SOAP supports (see Appendix A, “SOAP Data Types and Data Type Conversions”), we should be able to treat it as a simple data type. In short, the application shouldn’t require conversion routines. The Completed field is the only one that will allow remote interaction by the user—all other fields are read-only.
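Treating a datetime field as a simple type comes down to agreeing on its text form. XML Schema’s dateTime type uses the ISO 8601 layout (for example, 2001-09-15T08:30:00), so a value can travel as plain text and be rebuilt on the other side. The following Python sketch shows that round trip; the helper names and the sample value are hypothetical, and it ignores time zones, which a real application would need to settle.

```python
from datetime import datetime


def to_xsd_datetime(value: datetime) -> str:
    """Format a datetime in the ISO 8601 layout used by XML Schema dateTime."""
    return value.isoformat(timespec="seconds")


def from_xsd_datetime(text: str) -> datetime:
    """Rebuild the datetime from its text form (no time zone handling here)."""
    return datetime.fromisoformat(text)


# Hypothetical task due date; the field name is not taken from Figure 8.5.
due = datetime(2001, 9, 15, 8, 30, 0)
wire_text = to_xsd_datetime(due)
round_trip = from_xsd_datetime(wire_text)
print(wire_text)
```

Because the text form carries the full value, no custom conversion routine is needed for this field.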
Creating the Server-Side Component
This section of the chapter looks at the issues surrounding server-side components used for database management. In some cases, you’ll find that the components you use for a SOAP application are precisely the same components that you used for other applications created for DCOM or CORBA. In other cases, you’ll find that you need to create new components to handle data conversion or other issues that you might not have when working with binary technologies.
We’ll also build a test component in this section. This component will interface directly with the database, but won’t interface with the SOAP application. In fact, you’ll find a DCOM application that can also interact with the database in the source code file available from the Que Web site in the \Chapter 08\Task Checker directory. The point is that we’re trying to separate the SOAP logic from the database management logic.
Many SOAP applications will work with the Microsoft SOAP Toolkit even if you don’t have the latest version of their XML parser. However, database applications tend to stretch the limits of the XML parser, so it’s a good idea to get the latest version to ensure good performance. You can download the latest version (as of this writing) at http://msdn.microsoft.com/downloads/default.asp?URL=/code/topic.asp?URL=/msdn-files/028/000/072/topic.xml. Microsoft has also made some new tools and samples available to SOAP developers at http://msdn.microsoft.com/code/sample.asp?url=/msdn-files/027/001/457/msdncompositedoc.xml. Finally, you can get the latest SOAP merge modules (those required for application distribution) at http://www.microsoft.com/downloads/release.asp?ReleaseID=29556.
Tips for Working with Database Components
The biggest problem with most of the database projects that I update is that the original developer used huge components that do everything but brush the administrator’s teeth. This might seem like a good approach to some developers, but it makes it hard to work on the project later. The best approach is to create multiple modules so you can add functionality as needed. After all, components are supposed to reduce the programmer’s workload; they keep you from writing the same code repeatedly.
You can keep yourself out of hot water with the boss in a number of other ways. Database components are some of the most complex components to write because the penalty for failure is so high and you’re always running into unforeseen problems. The following list presents some ways to decrease your workload, complete the project faster, force the application to run faster, and maintain a high level of reliability.
■
Don’t overwhelm yourself with details. At least a few developers whom I talked with have said that SOAP decreased their development time. One developer went so far as to say that a project that would have normally required three months with DCOM, required a mere month when working with SOAP. The secret all of these developers had was not worrying too much about details such as security. They accepted the functionality that SOAP had to offer at the outset of their project and saved time.
■
Use the low-level API only as required. Unless your application has a special need such as using UDTs, you require the very last ounce of performance, or your application won’t work with an IDispatch interface, always use the high-level API. Using the highlevel API saves considerable time and effort and it’s easier to debug.
■
Force existing components to do as much of the work as possible. You don’t want to reinvent the wheel when working with SOAP. If you already have components in place, make them perform the work. Even if you’re starting from scratch, you should consider the desktop scenario. Create components in such a way that you can use more than one access technique with small additions to the current component structure.
■
Be prepared to perform data conversions, even if it appears that you should be able to use the native database types. Most SOAP toolkits have holes in their data type coverage. Some database types are incompatible with the XML specification. Likewise, some programming languages implement standard data types or constructs in ways that
Creating the Server-Side Component
257
SOAP can’t understand. All three of these issues conspire to make it difficult to transfer anything but text consistently. ■
Always access the database in safe ways. Developers write many database components with local access in mind. This means that you have to consider the method the component normally uses to create connections and handle data. In some cases, the SOAP component will have to perform additional work to ensure safe access. This includes performing tasks such as closing the connection after each transaction.
■
Test all data exchange conditions fully. SOAP is a lot more sensitive to problems such as NULL fields than the typical desktop application. You must resolve all data access, even if it means creating an empty string to transfer across the wire. It’s also important to remember that the XML parser will strip some control characters from the data stream. For example, a carriage return/linefeed pair will become a linefeed—the XML parser will always strip the carriage return from the data stream.
■
Use Base64 data encryption as needed to transfer binary data. Many databases provide binary large object (BLOB) support. BLOBs never transfer across the wire when using SOAP, so you must convert them to a format that SOAP can transport. Fortunately, most SOAP toolkits provide conversion support as part of the package. For example, the Microsoft SOAP Toolkit performs the conversion automatically if the XSD type in the WSDL file is set to base64Binary.
■
Create a single result table to transfer to the client. A SOAP connection won’t support multiple tables. You need to perform a join on the data to create a single result set that the user can view. If the user needs another view of the data, the application will need to request a new result set from the server.
■
Always use a static database connection. SOAP won’t allow dynamic connections at this stage of the game. The best you can hope to achieve is a view of a static table and the ability to send update information to the server. In many cases, it’s also important to understand that updates might prove unreliable; therefore, they are risky.
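The Base64 and NULL-handling tips in the list above can be sketched in a few lines. The example is written in Python rather than the chapter’s Visual Basic so that it runs without the SOAP Toolkit; the function names encode_blob(), decode_blob(), and null_to_text() are hypothetical helpers, not part of any toolkit.

```python
import base64


def encode_blob(blob: bytes) -> str:
    """Return the Base64 text form of a binary field so it can travel as XML text."""
    return base64.b64encode(blob).decode("ascii")


def decode_blob(text: str) -> bytes:
    """Reverse encode_blob() on the receiving side."""
    return base64.b64decode(text)


def null_to_text(value) -> str:
    """Replace a database NULL (None in Python) with an empty string."""
    return "" if value is None else str(value)


if __name__ == "__main__":
    raw = bytes([0, 1, 2, 250, 255])      # bytes that plain XML text can't carry
    wire = encode_blob(raw)
    print("On the wire:", wire)
    print("Round trip intact:", decode_blob(wire) == raw)
    print("NULL becomes:", repr(null_to_text(None)))
```

A toolkit that maps base64Binary automatically performs the same conversion for you; a helper like this only matters when you move binary data inside a field that the toolkit treats as a plain string.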
Working with SQLXML

Microsoft has added an interesting new feature to SQL Server 2000: XML support. SQL Server 2000 now accepts XML input and outputs XML to respond to client requests. Like many other XML technologies, this one uses a specially formatted document. This is a requirement because XML doesn’t specify the message format; it specifies the technique used to create the message. SQLXML is pure XML. It helps you communicate directly with SQL Server 2000 over an Internet connection.
Because SQLXML is such a new technology, information about it is a little scarce as of this writing. Of course, magazines and book authors will soon take care of any holes in coverage for this new technology. In the meantime, you can find out more about SQLXML on a few online sites. VBXML.COM (http://www.vbxml.com/people/speer/) includes an overview of SQLXML and shows how to construct SQLXML messages. This site also hosts a discussion group where you can talk about SQLXML. The New England SQL Server User Group (http://www.nesql.com/downloads2.asp) has at least one presentation about SQLXML on its Web site that includes a complete example of the technology at work. Of course, you can always visit Microsoft’s site (http://www.microsoft.com/sql/default.asp) and download an evaluation copy of SQL Server 2000 to see how this technology works. Microsoft also provides an overview of SQL Server’s XML capability (http://msdn.microsoft.com/library/techart/d51webapparch.htm) and allows you to visit one of several SQL Server newsgroups (try microsoft.public.sqlserver.xml or microsoft.public.sqlxml.viewmapper) to ask questions.
SQLXML works by creating an updategram. This is a special XML message that contains instructions on what tasks the server should perform. The actions that you can perform using SQLXML depend on the content of the schema for that database. The schema also defines the database layout and other essential details. The updategram itself relies on a namespace that defines the available actions. For example, the updg namespace contains the keywords sync, before, and after. A sync block groups changes into one transaction, while the before and after blocks describe a record as it exists now and as it should look afterward. An after block with no matching before block inserts a new record.

This technology also makes heavy use of stored procedures. The majority of the database management work occurs when an XML message makes a request of a stored procedure and passes it required data. This means you’re placing more of the processing burden on the database manager, which is usually the least scalable part of a database application. It’s important to consider performance when working with SQLXML. It probably isn’t the best choice for exclusive access to large applications, but could work well for users on the road.

SOAP enters the picture when you want to allow SQLXML to scale better. You can send SOAP messages to local components that can monitor server load and upload the messages as server load permits. The advantage of this method is that the components can be extremely simple; you don’t need to translate the incoming XML data for the database manager. Of course, the advantage works the other way as well. SQL Server can output XML that a component can then wrap within a SOAP envelope without much effort. A component that receives XML directly can work more efficiently than one that has to translate the data first. Creating a SOAP envelope is relatively simple when compared to working with recordsets directly.

Theoretically, using SQLXML would allow you to create the application without creating multiple server-side components. Instead of creating both business logic and a buffer component, you can create just the buffer component. However, although this solution will save development time, it won’t scale as well as the traditional multi-component approach.
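To make the updategram idea more concrete, here is a minimal sketch that builds one for this chapter’s TaskTable. It is written in Python rather than Visual Basic so it can run anywhere. It assumes the default mapping, in which an element name stands for a table and its attributes stand for columns. The build_updategram() helper and the sample values are hypothetical; only the updg namespace and its sync, before, and after elements come from SQLXML itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by SQLXML updategrams.
UPDG_NS = "urn:schemas-microsoft-com:xml-updategram"


def build_updategram(table, key_columns, new_values):
    """Build an updategram that changes one record.

    key_columns identifies the existing record (the before block);
    new_values holds the columns to change (merged into the after block).
    Assumes element = table and attribute = column.
    """
    ET.register_namespace("updg", UPDG_NS)
    root = ET.Element("ROOT")
    sync = ET.SubElement(root, f"{{{UPDG_NS}}}sync")

    # The record as it exists now.
    before = ET.SubElement(sync, f"{{{UPDG_NS}}}before")
    ET.SubElement(before, table, dict(key_columns))

    # The record as it should look after the update.
    after = ET.SubElement(sync, f"{{{UPDG_NS}}}after")
    ET.SubElement(after, table, {**key_columns, **new_values})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_updategram(
        "TaskTable",
        {"Employee": "1001", "Task_Name": "Backup"},   # hypothetical values
        {"Completed": "2001-09-01"},
    ))
```

Leaving out the before block would turn the same message into an insert, and leaving out the after block would turn it into a delete, which is why the pair is enough to describe every change.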
Using Multiple Server-Side Components

Generally, a database application employs several layers of business logic components that massage the data before the database manager accepts it for storage. These components are the most expensive part of the database application because they contain the most complex logic. In short, if you have an existing application that you want to convert to SOAP, rewriting these components will represent a major part of the cost.
Most SOAP applications will require a minimum of two server-side components. The first is the database component. It provides standardized access to the database content. The second is the SOAP component. It performs these three functions:

■ Converts the database data as needed

■ Creates a buffer for the application to increase reliability and protect data

■ Creates the SOAP message using any techniques the toolkit supports
Depending on your server load, you might want to provide other components in the loop. For example, it’s possible that the server load during the day will be so heavy that you’ll need some form of offline storage to buffer more incoming data than memory will allow. In this case, you could add a local Queued Component (or the equivalent for the server your company uses) to place messages on disk. The server can then pick up the messages as processing capacity allows.

Using a secondary buffer can help reduce the risks associated with Internet application use. It’s harder for crackers to create a denial of service (DoS) attack when the server processes messages at its own pace. Yes, it’s true that the server’s hard drive will eventually fill with messages the server can’t process fast enough, but using the secondary buffer buys you time and allows the server to continue processing requests.

Some companies will require additional components if they use a layered security scheme. The message travels through several layers of firewalls and other protection before the back-end server even sees it. You don’t need this level of security for a local connection, but it has become a reality for Internet connections. Each server in the chain will require a set of SOAP and server components. (This is where the SOAP actor feature comes into play: multiple servers all perform part of the processing.)

The important consideration is to design the components for your applications carefully. You’ll want to design the application in a way that allows you to “bolt on” SOAP support as needed. Never build SOAP support directly into the database or business logic components on your system. Otherwise, you might find that you have to redesign everything from scratch later. There’s no guarantee that SOAP will provide everything your company needs, so designing to allow other forms of connection is essential. The example in this chapter uses DCOM as the alternative connection, but you could easily use other protocols such as CORBA instead.
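The secondary disk buffer described in this section can be pictured with a small sketch. It is written in Python rather than Visual Basic and is only a stand-in for whatever queuing service your server provides. The MessageSpool class, its put() and take() methods, and the .msg file naming are all hypothetical, and the sequence counter assumes a single writer process.

```python
import os
import tempfile


class MessageSpool:
    """Hypothetical disk-backed buffer that stores one file per pending message."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        self._sequence = 0          # assumes one writer process

    def put(self, message):
        """Write a message to disk so it survives until the server is ready."""
        self._sequence += 1
        final_path = os.path.join(self.directory, f"{self._sequence:012d}.msg")
        temp_path = final_path + ".tmp"
        with open(temp_path, "w", encoding="utf-8", newline="") as handle:
            handle.write(message)
        os.replace(temp_path, final_path)   # the reader never sees a partial file
        return final_path

    def take(self):
        """Return the oldest pending message, or None when the spool is empty."""
        pending = sorted(n for n in os.listdir(self.directory) if n.endswith(".msg"))
        if not pending:
            return None
        path = os.path.join(self.directory, pending[0])
        with open(path, encoding="utf-8", newline="") as handle:
            message = handle.read()
        os.remove(path)
        return message


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        spool = MessageSpool(folder)
        spool.put("<Envelope>first</Envelope>")
        spool.put("<Envelope>second</Envelope>")
        print(spool.take())
        print(spool.take())
        print(spool.take())
```

Because the front end only writes files and the back end only reads them, the server can drain the spool at its own pace, which is the property the text relies on.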
Generating the Code

The example uses a single server-side component to access the database. This database component will allow you to transfer data in several ways, including DCOM. You’ll find a DCOM client for this component in the \Chapter 08\Task Checker directory of the source code file. Listing 8.4 shows what the code looks like. Notice that we’re performing standard database tasks; nothing about the component indicates that we’ll eventually use it as part of a SOAP application.
Listing 8.4  Server-Side Database Component Accessible from DCOM or SOAP

Public Function GetTasks(EmployeeNumber As String) As Variant
    Dim TaskConn As ADODB.Connection    'Database connection.
    Dim TaskRecc As ADODB.Recordset     'Recordset
    Dim ConnStr As String               'Database connection string.
    Dim Result() As String              'Array of result records.
    Dim RecordCount As Integer          'Total number of records in database.
    Dim TotalCount As Integer           'Total number of pertinent records.

    'Create a connection string.
    ConnStr = "Provider=sqloledb;" & _
              "Data Source=WinServer;" & _
              "Initial Catalog=TaskList;" & _
              "User Id=sa;Password=; "

    'Connect to the database.
    Set TaskConn = New ADODB.Connection
    TaskConn.Open ConnStr

    'Obtain the required recordset.
    Set TaskRecc = New ADODB.Recordset
    TaskRecc.Open "TaskTable", _
                  TaskConn, _
                  adOpenStatic, _
                  adLockOptimistic, _
                  adCmdTable

    'Determine the number of pertinent records.
    TotalCount = 0
    For RecordCount = 0 To TaskRecc.RecordCount - 1
        If TaskRecc!Employee = EmployeeNumber Then
            TotalCount = TotalCount + 1
        End If
        TaskRecc.MoveNext
    Next
    ReDim Result(TotalCount - 1, 5)

    'Reset the variables.
    TaskRecc.MoveFirst
    RecordCount = 0
    TotalCount = 0

    'Create an array of task record items.
    For RecordCount = 0 To TaskRecc.RecordCount - 1

        'Verify the record item is for the current employee.
        If TaskRecc!Employee = EmployeeNumber Then
            Result(TotalCount, 1) = TaskRecc!Employee
            Result(TotalCount, 2) = TaskRecc!Task_Name
            If Not TaskRecc!Task_Description = "" Then
                Result(TotalCount, 3) = TaskRecc!Task_Description
            Else
                Result(TotalCount, 3) = "No Task Description Available"
            End If
            Result(TotalCount, 4) = TaskRecc!Date_Assigned
            If Not TaskRecc!Completed = "" Then
                Result(TotalCount, 5) = TaskRecc!Completed
            Else
                Result(TotalCount, 5) = "Not Completed"
            End If

            TotalCount = TotalCount + 1
        End If

        'Move to the next record.
        TaskRecc.MoveNext
    Next

    'Return the task array to the caller.
    GetTasks = Result

    'Close the recordset and database.
    TaskRecc.Close
    TaskConn.Close

    'Clear the variables.
    Set TaskRecc = Nothing
    Set TaskConn = Nothing
End Function

Sub SetComplete(EmployeeID As String, TaskName As String, Assigned As Date)
    Dim TaskConn As ADODB.Connection    'Database connection.
    Dim TaskRecc As ADODB.Recordset     'Recordset
    Dim ConnStr As String               'Database connection string.
    Dim SearchStr As String             'Database search string.

    'Create a connection string.
    ConnStr = "Provider=sqloledb;" & _
              "Data Source=WinServer;" & _
              "Initial Catalog=TaskList;" & _
              "User Id=sa;Password=; "

    'Connect to the database.
    Set TaskConn = New ADODB.Connection
    TaskConn.Open ConnStr

    'Create a search string.
    SearchStr = "Select * From TaskTable Where " & _
                "Employee = '" & EmployeeID & _
                "' and Task_Name = '" & TaskName & _
                "' and Date_Assigned = '" & Assigned & "'"

    'Obtain the required recordset.
    Set TaskRecc = New ADODB.Recordset
    TaskRecc.Open SearchStr, _
                  TaskConn, _
                  adOpenStatic, _
                  adLockOptimistic, _
                  adCmdText

    'Set the completed field to the current date.
    TaskRecc!Completed = Date

    'Update the record.
    TaskRecc.Update

    'Close the recordset and database.
    TaskRecc.Close
    TaskConn.Close

    'Clear the variables.
    Set TaskRecc = Nothing
    Set TaskConn = Nothing
End Sub
As you can see, this component has two methods. The first, GetTasks(), returns an array of tasks for the employee making the request. The second, SetComplete(), will place a date in the Completed field for the user. Given the simple nature of the database we’re using, these two functions provide nearly full control (at least as much as you would want a remote user to have over your system). Notice the lack of destructive commands, such as the ability to erase a record.

Both methods begin by creating a connection to the database. This includes creating a connection object and a recordset. The GetTasks() method obtains the entire table for the recordset, while the SetComplete() method retrieves a single record based on the user’s current position within the task list. It’s possible to increase the efficiency of the GetTasks() method by using a more selective record-retrieval method, but the method shown works fine for smaller databases, such as the one in the example.

The GetTasks() method performs several processing steps on the recordset. First, it determines how many records the database contains for the user in question. It uses the result of this counting operation to ReDim a dynamic array. After the component knows how big to make the array, it resets all of the variables and fills the array with data. Notice the checks for NULL values for fields that allow NULLs in the database. Failure to check for NULLs will cause the application to fail.
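The more selective record-retrieval method mentioned above usually means letting the database filter the rows instead of walking the whole table. The sketch below shows the idea in Python with the standard sqlite3 module rather than Visual Basic and ADO, so it is an illustration of the technique and not a drop-in replacement for Listing 8.4. The get_tasks() name is hypothetical; the table and column names follow the listing.

```python
import sqlite3


def get_tasks(conn, employee_number):
    """Fetch only one employee's tasks and replace NULLs with display text."""
    rows = conn.execute(
        "SELECT Employee, Task_Name, Task_Description, Date_Assigned, Completed "
        "FROM TaskTable WHERE Employee = ?",      # the database does the filtering
        (employee_number,),
    ).fetchall()
    return [
        (
            employee,
            name,
            description or "No Task Description Available",  # NULL or empty
            assigned,
            completed or "Not Completed",                    # NULL or empty
        )
        for employee, name, description, assigned, completed in rows
    ]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TaskTable (Employee TEXT, Task_Name TEXT, "
        "Task_Description TEXT, Date_Assigned TEXT, Completed TEXT)"
    )
    # Hypothetical sample rows.
    conn.execute("INSERT INTO TaskTable VALUES ('1001', 'Backup', NULL, '2001-08-01', NULL)")
    conn.execute("INSERT INTO TaskTable VALUES ('2002', 'Audit', 'Check logs', '2001-08-02', '2001-08-03')")
    for task in get_tasks(conn, "1001"):
        print(task)
    conn.close()
```

With the filter in the query, the component no longer needs a separate counting pass, because the result set contains only the pertinent records.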
Always use dynamic arrays when working with databases because you don’t know in advance how many records the array will need to store. In addition, Visual Basic will display an error if you try to copy a static array to a variant. This is a known issue and Microsoft provides suggestions for working around it in the Visual Studio documentation. Unfortunately, dynamic arrays are difficult to transfer using SOAP. We’ll see later in this chapter how this affects the application. Arrays represent one of the major trade-offs of using SOAP for your application; Visual Basic and SOAP have completely different requirements in this arena.
The SetComplete() method has an easier life. After it obtains a copy of the recordset, it changes the date in the Completed field and updates the record. Theoretically, the change should take place immediately. However, even if the change does occur immediately, the constraints placed on the application by SOAP mean that at least some users would have outdated information loaded on their machines. If providing updated information is a priority for your application, you’ll need to add some type of timed update to the application. Events might not work very well in this case, because SOAP applications don’t maintain a consistent connection with the server, something that events rely on to do their job.

Both methods end by closing the connection and recordset. Keeping track of resource usage with SOAP applications is essential because they tend to request more resources than a desktop application in many cases. For example, SOAP applications require a new connection every time they make a request. Therefore, you need to check code for proper database closures and object releases. Although the system will compensate for some level of negligence in this area, failure to release resources will eventually eat away at application performance and could cause the server to crash when resources are exhausted.
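One way to make sure every request releases its connection is to tie the release to the structure of the code. The sketch below shows that pattern in Python with sqlite3 instead of Visual Basic and ADO. The set_complete() function and the db_path parameter are hypothetical; the table and column names follow Listing 8.4. It also passes the search values as query parameters rather than building the SQL text from user input.

```python
import sqlite3
from contextlib import closing
from datetime import date


def set_complete(db_path, employee_id, task_name, assigned):
    """Mark one task complete using a connection that lives for this call only.

    Returns the number of records changed so the caller can detect a miss.
    """
    with closing(sqlite3.connect(db_path)) as conn:   # always closed on exit
        with conn:                                    # commit, or roll back on error
            cursor = conn.execute(
                "UPDATE TaskTable SET Completed = ? "
                "WHERE Employee = ? AND Task_Name = ? AND Date_Assigned = ?",
                (date.today().isoformat(), employee_id, task_name, assigned),
            )
            return cursor.rowcount
```

A return value of 0 tells the caller that no record matched, which is the case that fails in the original listing when the search comes back empty.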
Creating a Middle-Tier Component

You might ask why creating the middle-tier component involves anything more than simply moving data from the server to a SOAP message. The problem is that SOAP doesn’t handle odd data formats very well. The server-side component in Listing 8.4 outputs an array using a variant. Theoretically, SOAP will handle an array without problems. In fact, if we create a component that simply passes the data along as shown in Listing 8.5, the various WSDL generators will generate the correct output (except for 4S4C in this case). Figure 8.6 shows typical output.
Listing 8.5  Just Passing the Data to SOAP as Shown Here Won’t Work

Public Function GetTasks(EmployeeNumber As String) As Variant
    'Create a task database interface object.
    Dim GetTask As DataAccess.TaskTable
    Set GetTask = New DataAccess.TaskTable

    'Get a list of current tasks.
    TaskTable = GetTask.GetTasks(EmployeeNumber)

    'Pass the list to the client.
    GetTasks = TaskTable

    'Clean up the objects.
    Set GetTask = Nothing
End Function
Notice that the code looks fine, as does the WSDL output. The WSDL Generator utility detects the date variable for the SetComplete() method and handles the variant as an xsd:anyType. The problem is the line highlighted in Figure 8.6. If you run this code as is, you’ll receive the error message shown in Figure 8.7 from the client. Notice that the error message states that the SoapMapper was unable to work with the anyType variable.

Fixing this problem means performing one of three tasks. You can hand-edit the WSDL file to replace the anyType value with an array, create an IDL file, or change the data into a format that’s more palatable for SOAP. (See the complex data type example earlier in the chapter for an example of an IDL file.)

Figure 8.6 The WSDL output looks correct, but won’t generate a good result.
Figure 8.7 SOAP will generate an error message for the client when working with certain types of variants.
Many professionals suggest you avoid using the variant type for SOAP applications. XML does support this data type, and so does SOAP. The problem is that the SOAP mapper often gets confused when working with variant data and assumes that it can’t place the requested data within the variable. Arrays are simple constructs to place within a variant, yet most developers report they can’t use them with any SOAP toolkit. Trying to map complex data such as images into a variant will fail more often than not. In most cases, you’ll want to use Base64 encoding for complex data and avoid using variants even if they are supported by your SOAP toolkit.
We’ll use the third technique in this section. The reason that I chose this method is that it takes the least time to implement and promises the greatest probability of success on the first attempt. Listing 8.6 shows a different version of the SOAP component that modifies the data into a string that we’ll parse on the client end.
Listing 8.6  A SOAP Component That Works

Public Function GetTasks(ByVal EmployeeNumber As String) As String
    Dim RecordCount As Integer
    Dim TaskString As String

    'Create a task database interface object.
    Dim GetTask As DataAccess.TaskTable
    Set GetTask = New DataAccess.TaskTable

    'Get a list of current tasks.
    TaskTable = GetTask.GetTasks(EmployeeNumber)

    'Convert the array into a string that we can parse.
    TaskString = ""
    For RecordCount = 0 To UBound(TaskTable)
        TaskString = TaskString & TaskTable(RecordCount, 1) & vbCrLf
        TaskString = TaskString & TaskTable(RecordCount, 2) & vbCrLf
        TaskString = TaskString & TaskTable(RecordCount, 3) & vbCrLf
        TaskString = TaskString & TaskTable(RecordCount, 4) & vbCrLf
        TaskString = TaskString & TaskTable(RecordCount, 5) & vbCrLf
    Next

    'Pass the list to the client.
    GetTasks = TaskString

    'Clean up the objects.
    Set GetTask = Nothing
End Function

Sub SetComplete(ByVal EmployeeID As String, _
                ByVal TaskName As String, _
                ByVal Assigned As Date)
    'Create a task database interface object.
    Dim GetTask As DataAccess.TaskTable
    Set GetTask = New DataAccess.TaskTable

    'Pass the completed field data to the server.
    GetTask.SetComplete EmployeeID, TaskName, Assigned

    'Clean up the objects.
    Set GetTask = Nothing
End Sub
This version of the code generates a much smaller WSDL file, as shown in Figure 8.8. Notice that we’re using a string to transfer the data now. The string will only incur a small increase in data size and shouldn’t affect performance.
Figure 8.8 Using the alternative code produces a smaller WSDL file that relies on simple data types.
Let’s discuss the SOAP component code in a little more detail. Notice that the component contains two methods, just like the database component, and that it uses the same method names. A few developers would say this causes confusion, but you want to maintain some level of consistency at the client end of the SOAP application picture. Both components use the same calling convention. The only difference is that the SOAP component returns a string instead of a variant for the GetTasks() method call.

Both methods begin by creating an instance of the database component we discussed in the previous section. In this regard, they act just as a client-side application would. Comparing this component code with the DCOM client example on the Web site will show some amazing similarities.

The GetTasks() method retrieves the array of records from the database component. It uses a loop to convert the individual records into a single string for transfer to the client. The example code places a carriage return/linefeed combination after each field entry. The XML parser will strip off all of the carriage returns, but the linefeeds will remain in place. You could use any special character to provide separations in the data for later parsing. However, using a carriage return/linefeed pair makes it easier to read the data using utilities such as tcpTrace. It’s also unlikely that you’ll ever run into either a carriage return or a linefeed within the database data, so this pair represents a reliable delimiter.

The SetComplete() method simply passes the data from the client to the server. In this case, we need to worry about the date variable. However, most SOAP toolkits provide good date and time type support, so it shouldn’t be too much of an issue.
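You can see the carriage return behavior for yourself with a short experiment. The sketch below is written in Python rather than Visual Basic. It builds the same kind of delimited string that Listing 8.6 produces and then passes it through an XML parser. The records_to_string() and through_xml() helpers and the Result element name are hypothetical, and a real toolkit may escape or encode the text differently on the sending side.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def records_to_string(records):
    """Join every field of every record, ending each field with CR/LF."""
    return "".join(str(field) + "\r\n" for record in records for field in record)


def through_xml(text):
    """Simulate the trip across the wire: wrap the text in an element and parse it."""
    document = "<Result>" + escape(text) + "</Result>"
    return ET.fromstring(document).text or ""


if __name__ == "__main__":
    # Hypothetical sample record with the five fields the chapter uses.
    sent = records_to_string(
        [("1001", "Backup", "Nightly backup", "2001-08-01", "Not Completed")]
    )
    received = through_xml(sent)
    print("Carriage returns sent:    ", sent.count("\r"))
    print("Carriage returns received:", received.count("\r"))
    print("Linefeeds received:       ", received.count("\n"))
```

The parser normalizes each carriage return/linefeed pair to a single linefeed, so the client should split on the linefeed alone.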
Creating the Client-Side Application

The client application requires more code than any other application in this book. No matter how you get data from the server to the client, it always requires some amount of interpretation and formatting. SOAP adds to this burden by forcing you to convert data in many situations where you wouldn’t need to with a desktop application. Listing 8.7 shows the utility methods for this example.
Listing 8.7  Utility Routines for the Client Application

Private Sub cmdQuit_Click()
    'Exit the program.
    End
End Sub

Public Sub DisplayText()
    'Display the fields within the current record.
    txtEmployeeID.Text = TaskTable(RecordNum, 1)
    lblTitle.Caption = TaskTable(RecordNum, 2)
    lblDescription.Caption = TaskTable(RecordNum, 3)
    lblAssigned.Caption = TaskTable(RecordNum, 4)
    lblComplete.Caption = TaskTable(RecordNum, 5)
End Sub

Public Sub ParseData()
    Dim TextPos As Integer
    Dim RecordCount As Integer

    'Calculate the number of records.
    TextPos = 1
    RecordCount = 0
    Do
        RecordCount = RecordCount + 1
        TextPos = TextPos + 1
        TextPos = InStr(TextPos, TaskString, Chr(10))
    Loop While TextPos > 0
    RecordCount = RecordCount / 5

    'Dimension the table to hold the number of records.
    ReDim TaskTable(RecordCount - 1, 5)

    'Parse the data in a loop.
    RecordCount = 0
    Do
        TaskTable(RecordCount, 1) = Left(TaskString, _
                                         InStr(TaskString, _
                                               Chr(10)) - 1)
        TaskString = Right(TaskString, _
                           Len(TaskString) - InStr(TaskString, _
                                                   Chr(10)))
        TaskTable(RecordCount, 2) = Left(TaskString, _
                                         InStr(TaskString, _
                                               Chr(10)) - 1)
        TaskString = Right(TaskString, _
                           Len(TaskString) - InStr(TaskString, _
                                                   Chr(10)))
        TaskTable(RecordCount, 3) = Left(TaskString, _
                                         InStr(TaskString, _
                                               Chr(10)) - 1)
        TaskString = Right(TaskString, _
                           Len(TaskString) - InStr(TaskString, _
                                                   Chr(10)))
        TaskTable(RecordCount, 4) = Left(TaskString, _
                                         InStr(TaskString, _
                                               Chr(10)) - 1)
        TaskString = Right(TaskString, _
                           Len(TaskString) - InStr(TaskString, _
                                                   Chr(10)))
        TaskTable(RecordCount, 5) = Left(TaskString, _
                                         InStr(TaskString, _
                                               Chr(10)) - 1)
        TaskString = Right(TaskString, _
                           Len(TaskString) - InStr(TaskString, _
                                                   Chr(10)))
        RecordCount = RecordCount + 1
    Loop While Len(TaskString) > 0
End Sub
The application uses a separate DisplayText() function to save development time. As we’ll see in the next section, the application needs to display data from several different locations. There isn’t anything too special about this routine except that the RecordNum variable tracks the current record within the TaskTable array.

The ParseData() function requires a little more explanation. The code begins by calculating the number of records the application received. The SOAP component in the previous section could have passed the number of records as part of its output, but it’s better to calculate this number at the client so that you can account for any lost records. Lost data is always possible when using an Internet connection; so, calculating the number of records at each endpoint is essential.

After the application knows the number of records, it can ReDim the TaskTable array and begin placing data within it. Notice that the parsing routine uses standard text manipulation functions to locate the position of linefeeds within the data stream. As previously mentioned, the XML parser strips any carriage returns from the data stream, so you won’t find any.

It’s important to parse the data carefully. You not only have presentation concerns to worry about, but interaction with the database to consider. A stray control character will contaminate the data within the array. Passing this contaminated data back to the server (as we will with the SetComplete() method) results in searches that fail and other problems. Given the inability to provide robust error handling with SOAP, you’ll spend a lot of time debugging applications to find errors in parsing. In short, test your parsing routine thoroughly before placing it in the application.
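The parsing steps described above can be condensed into a short sketch that also rejects incomplete data. It is written in Python rather than Visual Basic, so treat it as an illustration of the logic and not as a replacement for ParseData(). The parse_tasks() name and the FIELDS_PER_RECORD constant are hypothetical; the five-field layout follows the chapter’s example.

```python
FIELDS_PER_RECORD = 5   # employee, task name, description, date assigned, completed


def parse_tasks(task_string):
    """Split a linefeed-delimited task string into five-field records."""
    if not task_string:
        return []

    fields = task_string.split("\n")
    if fields[-1] == "":
        fields.pop()                     # the final linefeed leaves one empty piece

    # Remove any stray carriage return so it can't contaminate later searches.
    fields = [field.rstrip("\r") for field in fields]

    # A count that isn't a multiple of five means part of a record was lost.
    if len(fields) % FIELDS_PER_RECORD != 0:
        raise ValueError(f"expected groups of {FIELDS_PER_RECORD} fields, got {len(fields)}")

    return [
        tuple(fields[start:start + FIELDS_PER_RECORD])
        for start in range(0, len(fields), FIELDS_PER_RECORD)
    ]


if __name__ == "__main__":
    # Hypothetical sample data in the format the SOAP component returns.
    sample = "1001\nBackup\nNightly backup\n2001-08-01\nNot Completed\n"
    for record in parse_tasks(sample):
        print(record)
```

Checking that the field count divides evenly by five is a cheap way to catch a truncated message before bad data reaches the display or the database.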
One of the biggest issues in creating a database application is safely moving the data from the server to the client. The client requires some type of mechanism to control precisely which data the server returns. This is especially important with SOAP applications because you’ll likely move data across connections that are slower than the ones provided by the typical network. Listing 8.8 shows the data exchange methods for this example.
Listing 8.8  Data Exchange Routines for the Client Application

Dim TaskTable() As String       'Task Record Array
Dim TaskString As String        'Unparsed Task Record String
Dim RecordNum As Integer        'Current Record Number

Private Sub cmdCompleted_Click()
    'Create the SOAP client.
    Dim Client As SoapClient
    Dim Result As String
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://localhost/soapexamples/TaskList/SOAPData.WSDL", _
        "SOAPData", _
        "TaskTableSoapPort"

    'Set the completed field.
    Client.SetComplete txtEmployeeID.Text, _
                       lblTitle.Caption, _
                       lblAssigned.Caption
    Set Client = Nothing

    'Display the current task again.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://localhost/soapexamples/TaskList/SOAPData.WSDL", _
        "SOAPData", _
        "TaskTableSoapPort"
    TaskString = Client.GetTasks(txtEmployeeID.Text)
    ParseData
    DisplayText
    Set Client = Nothing

    'We're finished, so exit.
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + Client.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + Client.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + Client.faultactor + vbCrLf + vbCrLf + _
                   "Detail: " + Client.detail
    MsgBox ErrorMessage, vbExclamation
End Sub

Private Sub cmdTask_Click()
    'Create the SOAP client.
    Dim Client As SoapClient
    Dim Result As String
    Dim ErrorMessage As String

    'Set up an error handler.
    On Error GoTo ErrorHandler

    'Create the connection.
    Set Client = New SoapClient
    Client.mssoapinit _
        "http://localhost/soapexamples/TaskList/SOAPData.WSDL", _
        "SOAPData", _
        "TaskTableSoapPort"

    'Get a list of current tasks.
    TaskString = Client.GetTasks(txtEmployeeID.Text)
    ParseData

    'Set the current record number.
    RecordNum = 0

    'If there is more than one record, enable
    'the Next button.
    If UBound(TaskTable) > 0 Then
        cmdTask.Enabled = False
        cmdTask.Visible = False
        cmdNext.Enabled = True
        cmdNext.Visible = True
        cmdNext.SetFocus
        cmdPrevious.Enabled = True
        cmdPrevious.Visible = True
    End If

    'Enable the Completed button.
    cmdCompleted.Enabled = True
    cmdCompleted.Visible = True

    'Display the current task.
    DisplayText
    Set Client = Nothing

    'We're finished, so exit.
    Exit Sub

'Display a message when an error occurs.
ErrorHandler:
    ErrorMessage = "Fault Code: " + Client.faultcode + vbCrLf + vbCrLf + _
                   "Fault String: " + Client.faultstring + vbCrLf + vbCrLf + _
                   "Fault Actor: " + Client.faultactor + vbCrLf + vbCrLf + _
                   "Detail: " + Client.detail
    MsgBox ErrorMessage, vbExclamation
End Sub
Both the cmdCompleted_Click() and the cmdTask_Click() methods begin the same way as many other examples in the book. They create a SOAP client and set up error handling. Notice that they both use the same access technique. The client connection controls access to the component as a whole, not to an individual method within the component.

One point of interest is that the cmdCompleted_Click() method contains two invocations of the SoapClient. After many hours of experimentation, I found that SOAP normally drops the connection between client and server long before you can make two method calls to the remote server. The dropped connection produces an error message similar to the one shown in Figure 8.9. As a result, you must re-create the SOAP connection for every method call, unlike desktop applications where the connection remains intact. This requirement emphasizes the fact that SOAP provides a one-way, one-time connection to the server. Note that both methods provide extended error information; you’ll need this additional information to debug most database applications.

Figure 8.9 Dropped HTTP connections produce an error message similar to the one shown here.
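The two habits described above, a fresh client for every call and a close look at the fault fields, can be sketched outside the SOAP Toolkit. The example below is written in Python rather than Visual Basic. It reads the four SOAP 1.1 fault fields, which correspond to the faultcode, faultstring, faultactor, and detail properties used in Listing 8.8. The parse_fault() and call_with_fresh_client() helpers are hypothetical, as is the make_client factory they accept.

```python
import xml.etree.ElementTree as ET

SOAP_NS = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}   # SOAP 1.1 envelope
FAULT_FIELDS = ("faultcode", "faultstring", "faultactor", "detail")


def parse_fault(response_xml):
    """Return the fault fields from a SOAP 1.1 response, or None if there is no fault."""
    root = ET.fromstring(response_xml)
    fault = root.find("soap:Body/soap:Fault", SOAP_NS)
    if fault is None:
        return None
    details = {}
    for name in FAULT_FIELDS:
        node = fault.find(name)          # fault children carry no namespace in SOAP 1.1
        details[name] = "" if node is None else "".join(node.itertext()).strip()
    return details


def call_with_fresh_client(make_client, method_name, *args):
    """Create a new client for a single call instead of reusing an old connection."""
    client = make_client()
    return getattr(client, method_name)(*args)


if __name__ == "__main__":
    # Hypothetical fault message for demonstration only.
    sample = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body><soap:Fault>"
        "<faultcode>soap:Server</faultcode>"
        "<faultstring>Example failure</faultstring>"
        "</soap:Fault></soap:Body></soap:Envelope>"
    )
    for name, value in parse_fault(sample).items():
        print(f"{name}: {value}")
```

Wrapping each remote call in a helper like call_with_fresh_client() keeps the reconnect rule in one place, so a new method on the form can’t forget it.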
The cmdCompleted_Click() method calls the SetComplete() method first. Notice that the three input values come from the client application dialog box. This is the best place to get the information because it shows the current task. Calling SetComplete() modifies the database, but it doesn’t update the display. The cmdCompleted_Click() method also calls on GetTasks() to update the TaskString and ParseData() to update the TaskTable array. Finally, it displays the updated information onscreen.

The cmdTask_Click() method displays the task data using the same techniques as cmdCompleted_Click(). cmdTask_Click() also enables various buttons depending on how many task records the application retrieves. The application always enables the cmdCompleted button after it retrieves the first record.

The last two methods for this example appear in Listing 8.9. Both methods move from record to record by changing the RecordNum value and displaying the new record onscreen. Both methods also check for invalid RecordNum values. They display an appropriate message when the user is already at the beginning or the end of the list.
Chapter 8
Providing Remote Database Access
Listing 8.9
Movement Routines for the Client Application
Private Sub cmdNext_Click()
    'Get the next record.
    If RecordNum < UBound(TaskTable) Then
        RecordNum = RecordNum + 1
        DisplayText
    Else
        MsgBox "Already at last task!", _
               vbExclamation Or vbOKOnly, _
               "Record Position"
    End If
End Sub

Private Sub cmdPrevious_Click()
    'Get the previous record.
    If RecordNum > 0 Then
        RecordNum = RecordNum - 1
        DisplayText
    Else
        MsgBox "Already at first task!", _
               vbExclamation Or vbOKOnly, _
               "Record Position"
    End If
End Sub
Testing the Application

Depending on your company’s requirements, database applications normally require several levels of testing. For this example, I performed three levels of testing:

■ Local testing using a data editing application
■ Remote testing using DCOM to simulate a desktop application
■ Remote testing using SOAP to simulate client access across the Internet
Local testing is important because it allows you to test the database implementation without incurring the cost of remote transactions. I used the simple data editor shown in Figure 8.10 for this example. This data editor also comes in handy for testing new records and resetting the database after a test.

Figure 8.10 Local tests are essential if you want to check the database for errors.

Make sure you test everything the database has to offer during local testing. This means testing any stored procedures, checking views, and trying the various indexes. It’s relatively easy to change the configuration of the database during the early design phases. Creating a local test program allows you to check the database thoroughly to ensure your design has no holes before you begin remote testing.

Using DCOM or some other binary technology to perform remote testing isn’t a requirement. However, you’ll find that you can test the server-side components better if you use this intermediate strategy. Using DCOM allows you to test the server-side database components separately from the SOAP components. Performing this individual level of testing reduces the number of places you have to look for errors. Figure 8.11 shows the client application interface used for both the DCOM and the SOAP versions.

Figure 8.11 The Task Checker user interface presents the task list information in a non-editable format.
You won’t be able to replace your component’s DLL after testing it with SOAP. The system locks the DLL file because the DLL is still loaded in memory. Use the IISRESET command-line utility to stop and start IIS to unload the component. Using this technique will allow you to update the DLL as needed for testing purposes without having to restart the machine. (Unfortunately, simply starting and stopping IIS won’t accomplish anything.) You can achieve the same effect by stopping and starting IIS from the Services MMC snap-in. A third option is to use net stop iisadmin /y to stop IIS, then net start w3svc to restart it. Some developers place these two commands in a batch file so they only have to remember one command. The IISRESET command is the fastest method, however, so it’s probably the best option for developers with local system access.
After you begin testing the SOAP version of the application, you need to check the SOAP component and how it interacts with the database component on the server. You perform most of the business logic testing as part of creating a DCOM client. The SOAP application is a “bolt on” to the existing DCOM (or other binary protocol) application. For many developers, performing all of these levels of testing is out of the question. They find that their testing time is short enough as it is without adding complexity to the testing process. The problem with testing a complex database application after you integrate all of the components is that you’ll find it’s difficult to locate problem components. The reason for all of these levels of testing is to simplify the testing process, making it easier to locate bugs and fix them.
Quick Fixes for Remote Database Applications

Database applications are becoming more complex every day. As companies create new sources of information and combine existing ones, database developers discover the complexity of the applications they create increases exponentially. In addition, companies are more global than they were in the past. The issue of 24/7 up-time is becoming the norm, and many developers find that it places additional requirements on their applications. No longer can developers rely on a maintenance period to compress data, re-create indexes, and generally clean up the application environment.
If you’re worried about the number of interdependent specifications and standards that your SOAP applications rely upon, you’re not alone. Many developers have noticed that XML in general is becoming a mass of interdependent specifications, and organizations such as W3C seem determined to output more as quickly as possible. I was recently online and noticed that one person had put all of the pieces together into a comprehensive (if not comprehendible) list. You can read more about the problems of interdependent standards at http://www.xml.com/pub/a/2001/02/21/deviant.html. Although this represents the opinion of one person, it also provides a good view of the problems most developers are facing with this ever-expanding technology.
Time is the enemy of the database developer. When a company experiences down time for other application types, it might affect just a few people. The requirement for getting the application fixed isn’t as great as when a broken database application affects a majority of the company. That’s why this section is titled “Quick Fixes for Remote Database Applications.” Not only do database applications present special challenges, but also few people are willing to wait for you to fix them.
Many issues can affect the reliability and usability of your application when using SOAP. For example, some developers are concerned that Microsoft programmed many parts of the Microsoft SOAP Toolkit using a single threaded apartment (STA). This can cause problems when developers call the various objects, such as HttpConnector, from a multi-threaded apartment (MTA) application. The bottom line is that you need to understand how the developer created your SOAP toolkit to ensure you can use it in all required application scenarios.
We’ve already discussed many potential problem areas as part of the examples in this chapter. However, there are other problems that you should consider during the design, development, and debugging stages of your application. The following sections discuss some database scenarios that are specific to SOAP.
Loss of Connection

The Internet presents many ways to lose a connection. Line noise and other hazards prevent even the best connection from working all of the time. In addition, you have to consider the effects of loading because most ISPs overbook their line capacity. In other words, you’ll run into situations where there isn’t enough bandwidth to support a database application fully because the ISP has sold part of the anticipated bandwidth to another customer. Some ISPs do sell guaranteed capacity at a huge markup. Having more than one ISP is more likely to provide the level of redundancy you require for a database application.

You also need to consider SOAP-specific issues. Most SOAP toolkits provide a watchdog feature. This feature looks for open connections without activity. If the timeout value expires, the connection is lost and the client receives an error message. Most of these toolkits also allow you to set the timeout value for the connection. Some developers might consider setting the timeout to a high value to ensure that the connection never fails due to internal processing. However, using a high value sets your application up for a “hung” state if the connection really does fail.

Setting the timeout value requires experimentation in a real world setup with your particular application. Several factors affect the setting, including the number of users, type of connection, amount of data, and level of processing. An order entry system might require a longer timeout than a catalog search application simply because the order entry system experiences a higher level of activity and passes more data between the client and server.

SOAP connections also fail when the unexpected happens. You might find that certain client and server combinations will disconnect, rather than display an error message. This happened with several of the test applications in the book, resulting in code and tool changes.
In theory, SOAP applications should communicate errors, but the reality is that vendors don’t always anticipate error-handling scenarios correctly, and the client or server side of the application crashes.
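One way to combine an explicit timeout with a bounded number of attempts is sketched below. It is a sketch under stated assumptions, not a tested recipe: GetTasksWithRetry and WSDL_URL are hypothetical names, and the code assumes that the toolkit's connector accepts a "Timeout" connector property expressed in milliseconds, which you should confirm against the documentation for your toolkit version. The service and port names come from Listing 8.8.

```vb
Private Const WSDL_URL As String = _
    "http://localhost/soapexamples/TaskList/SOAPData.WSDL"

' Hypothetical wrapper: tries a read-only call up to MaxAttempts times,
' creating a new client for each attempt.
Private Function GetTasksWithRetry(ByVal EmployeeID As String, _
                                   ByVal MaxAttempts As Long, _
                                   ByVal TimeoutMs As Long) As String
    Dim Client As SoapClient
    Dim Attempt As Long
    Dim ErrNum As Long
    Dim ErrText As String
    Dim Tasks As String

    For Attempt = 1 To MaxAttempts
        ' Any error in this attempt (init, property, or call)
        ' counts as a failed attempt.
        On Error Resume Next
        Set Client = New SoapClient
        Client.mssoapinit WSDL_URL, "SOAPData", "TaskTableSoapPort"
        ' Assumption: "Timeout" is a supported connector property (ms).
        Client.ConnectorProperty("Timeout") = TimeoutMs
        Tasks = Client.GetTasks(EmployeeID)
        ErrNum = Err.Number
        ErrText = Err.Description
        On Error GoTo 0
        Set Client = Nothing

        If ErrNum = 0 Then
            GetTasksWithRetry = Tasks
            Exit Function
        End If
    Next Attempt

    ' Every attempt failed, so report the last error to the caller.
    Err.Raise vbObjectError + 1000, "GetTasksWithRetry", _
        "No response after " & CStr(MaxAttempts) & " attempts: " & ErrText
End Function
```

Automatic retries of this kind suit read-only calls such as GetTasks(). A call that changes the database, such as SetComplete(), could be applied twice if the first response was simply lost, so it is safer to report the failure and let the user decide.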
Odd Data Entry Errors

Any time you convert data from one format to another, you risk loss of content or misinterpretation of data. The fact that SOAP forces at least two data conversions for every data transfer means that there’s a high probability you’ll see conversion errors sometime during your coding experience. Database applications often require more than just the two forced conversions. For example, the second program in this chapter requires a parsing mechanism at both ends of the application to convert an array to a string and back. In short, the sample application requires five total conversions: database to array, array to string, string to XML, XML to string, and string to array.

The fastest way to correct data entry problems is to create the data entry components and create a test suite to check the output of each component individually. If you can reduce the code to a black box of component implementations, integration testing becomes much easier. Make sure you check all types of data input a component might receive, including those that the component should never have to handle. Always test the component’s ability to handle NULL input and empty values. It’s especially important with strings to check that the data you input is the data that the component outputs as well. One example in the book caused many problems until I discovered the output from a component was one character (a space no less) too short. Errors like this are difficult to locate and fix.

Always create a test database as part of the testing process. I create a loopback test application that receives output from a database through a series of components, turns the data around, and sends it right back to the database using the same components. The data you see in the database at the end of the test should precisely match the original data. Don’t forget that you can use applications such as tcpTrace in these scenarios to log the flow of data. This allows you to view the data later for discrepancies in handling by specific components.
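The loopback idea can be tried on a single conversion pair before involving the database. The sketch below is illustrative only: PackFields, UnpackFields, RoundTripOK, and the "|" delimiter are hypothetical and are not the chapter's ParseData format. It checks that an array survives the array-to-string-to-array trip with every character intact, including trailing spaces and empty values.

```vb
' Hypothetical delimiter; the chapter's real string format may differ.
Private Const FIELD_DELIM As String = "|"

' Array -> string, the direction a server-side component would use.
Public Function PackFields(Fields() As String) As String
    PackFields = Join(Fields, FIELD_DELIM)
End Function

' String -> array, the direction a client-side parser would use.
Public Function UnpackFields(ByVal Packed As String) As String()
    UnpackFields = Split(Packed, FIELD_DELIM)
End Function

' Returns True only when every field survives the round trip unchanged.
Public Function RoundTripOK(Fields() As String) As Boolean
    Dim Result() As String
    Dim i As Long

    Result = UnpackFields(PackFields(Fields))

    ' A changed field count means data was split or merged in transit.
    If UBound(Result) - LBound(Result) <> _
       UBound(Fields) - LBound(Fields) Then
        Exit Function
    End If

    For i = 0 To UBound(Result)
        ' Binary comparison catches case and whitespace differences.
        If StrComp(Result(i), Fields(LBound(Fields) + i), _
                   vbBinaryCompare) <> 0 Then
            Exit Function
        End If
    Next i

    RoundTripOK = True
End Function

Public Sub TestRoundTrip()
    Dim Good(0 To 2) As String
    Dim Bad(0 To 1) As String

    Good(0) = "1001"
    Good(1) = "Write report "    ' The trailing space must survive.
    Good(2) = ""                 ' Empty value.
    Debug.Print RoundTripOK(Good)    ' Prints True

    Bad(0) = "1001"
    Bad(1) = "A|B"               ' Field contains the delimiter.
    Debug.Print RoundTripOK(Bad)     ' Prints False
End Sub
```

The second case shows why input that a component "should never have to handle" belongs in the test suite: a field that contains the delimiter comes back as two fields.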
Performance Issues

Early adopters of SOAP are already complaining that database performance is much worse than when using alternatives such as DCOM and CORBA. The problem is that a database application transfers fields of information within individual records. The data can grow by as much as 10 times its normal size due to the tags that XML uses to define the schema. Obviously, bandwidth begins to play a major role in performance at this point.

Therefore, the question of performance becomes one of reducing data growth in the XML portion of the message. One way to do this is to use smaller tags. However, generating a message by hand to ensure the tags stay small is impossible. The automated way to do this is to ensure you use smaller variable names for component method arguments. Make sure the name is descriptive, yet short enough to keep the resulting tags small when automatically generated by an application.

Another potential fix for this problem is to hand edit the WSDL file used to transfer the data. The problem with this approach is that even a small editing error can keep your application from running at all. However, careful hand editing will help reduce performance-related problems in your application.

I had previously mentioned that one of the testing phases you should use for a database application is to check the server-side components using a DCOM or CORBA client. Many vendors provide utilities to stress test components under such conditions. In some cases, developers avoid stress-testing their application because it’s time consuming and the performance benefits gained are small compared to the time required to implement the changes.
SOAP applications are different—you need every advantage you can get to reduce the size and processing time of the message. A few developers write their own XML parser to gain a speed advantage. Although writing an XML parser isn’t a daunting task, it could become time consuming. However, finding a fast XML parser for your system should be a top priority. It isn’t hard to imagine that magazines will begin to review the merits of various XML parsers by the time you read this chapter.
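The effect of tag length on message size is easy to estimate. A field sent as <Name>value</Name> adds 2 × Len(Name) + 5 characters of markup to the value. The following sketch puts numbers on that; the function names and the element names in the test are hypothetical, and the estimate ignores namespaces, attributes, and the SOAP envelope itself.

```vb
' Characters of markup added by wrapping one value in an element:
' "<" + name + ">" plus "</" + name + ">" = 2 * Len(name) + 5.
Public Function TagOverhead(ByVal ElementName As String) As Long
    TagOverhead = 2 * Len(ElementName) + 5
End Function

' Ratio of the wrapped size to the raw value size.
Public Function GrowthFactor(ByVal ElementName As String, _
                             ByVal Value As String) As Double
    If Len(Value) = 0 Then
        ' An empty value has no meaningful ratio; return 0 as a flag.
        GrowthFactor = 0
    Else
        GrowthFactor = (Len(Value) + TagOverhead(ElementName)) / Len(Value)
    End If
End Function

Public Sub CompareTagNames()
    ' 18-character name around a 4-character value: 45 / 4.
    Debug.Print GrowthFactor("EmployeeIdentifier", "1001")   ' Prints 11.25
    ' 5-character name around the same value: 19 / 4.
    Debug.Print GrowthFactor("EmpID", "1001")                ' Prints 4.75
End Sub
```

Short values suffer most, which is why a table of numeric codes grows far more than a table of long text fields when wrapped in XML.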
Server Is Busy or Missing Objects

If you receive a message that the server is busy, that the SOAP object is missing, or that the server experienced an unknown error, don’t feel alone; these are frequent problems for everyone. However, these problems normally occur because something simple is wrong with the server setup. This is especially true with test servers because you’re constantly replacing old component versions with new ones.

Always begin fixing this problem by registering your components and type libraries again, and then rebooting the server. The server will often read a copy of your component into memory, and fail to release it, even when you start and stop the Web server. Clearing the server’s memory is one way to ensure you start with a fresh setup.
You can check for component registration and configuration errors by creating a local test application and running it on the server. If the test application can successfully interact with the component, the problem isn’t one of registration. In fact, you can probably rule out component configuration problems as well.
Make sure you check for component configuration problems. For example, COM+ applications rely on a separate utility to configure the role-based security they use. Some components require registry entries to determine where they are supposed to run. They normally include a separate utility that you’ll use for configuration purposes. Many SOAP toolkits require external configuration files that you need to check. For example, 4S4C requires entries in a CONFIG.XML file that tell where to find the component and which component to use. Unfortunately, if you make a mistake in this file, you need to fix it and then stop and restart the Web server. Like the Microsoft SOAP Toolkit, 4S4C reads its configuration files into memory to enhance performance. Another persistent problem is one of Web server configuration. If you use virtual directories, make sure the directory setup is correct as well. For example, some operating systems will prevent access to a virtual directory if you configure one set of security rules from the Web server and another from the server’s file system configuration utility. Look for every potential source of configuration error as part of your debugging strategy.
Addressing Transaction Issues

At the time of this writing, SOAP doesn’t provide a form of native transaction support. Many developers feel this is a major concern because they’re used to using transactions for every desktop application.

The first question that you should ask yourself is whether your application actually requires the use of transactions. Consider the importance of the data and the worst-case scenario if the data is either lost or damaged. In at least some cases, you’ll find that transactions aren’t required. For example, transactions aren’t necessary for applications that only allow the user to view the data in question without modification. Likewise, applications in which the user can only add or modify non-critical data can probably exist without benefit of transactions.

No matter what you do, many applications are still going to require transactions. The data isn’t so critical that you have to keep it off the Internet, but it’s still critical enough that you need to protect it in some way. The big thing to remember is that you’ll incur data loss when using SOAP because the Internet is inherently unsafe.
The Internet is unsafe. If you have any doubts about this statement, spend some time reading about the security breaches that occur every day. If your data is so critical that you can’t afford loss or damage, your best bet is to keep it off the Internet. This book provides you with some ideas of how to circumvent problems with data transfer on the Internet. However, none of these methods guarantees specific results, and you might find that you still lose data. Always provide some means of recovering the data and keep in mind that you’re going to lose some data along the way.
Some developers have come up with interesting ways to provide at least a modicum of transaction support for their applications. One technique that seems popular is to add three elements to every SOAP message: user ID, session ID, and package ID. Using these three elements allows you to track the progress of every piece of data that your application transfers. However, using this technique requires that you maintain state information outside of the application. This means that there’s a chance the transaction information could get out of sync with the actual data transfer and you’ll lose data anyway. Another popular technique is to use cookies to track the progress of a data transfer. However, this means that you must set up both systems to use the cookies and agree on a methodology for using them. Some developers also complain that using the cookie method is overly complicated.
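The three-element technique amounts to giving every transferred package a unique key and remembering which keys have already arrived. The sketch below shows only that bookkeeping, with hypothetical names (PackageKey, RecordPackage, Received); it is not a substitute for real transaction support. It also illustrates the weakness mentioned above: the list of received keys lives outside the data itself, so it can fall out of sync with the database.

```vb
' Hypothetical store of package keys already processed. It lives in
' memory here; a real server would need to persist it somewhere.
Private Received As New Collection

' Combines user ID, session ID, and package ID into one key.
Public Function PackageKey(ByVal UserID As String, _
                           ByVal SessionID As String, _
                           ByVal PackageID As Long) As String
    PackageKey = UserID & ":" & SessionID & ":" & CStr(PackageID)
End Function

' Returns True the first time a key is seen and False for a repeat.
Public Function RecordPackage(ByVal Key As String) As Boolean
    On Error Resume Next
    ' Adding a key that already exists raises a run-time error.
    Received.Add Key, Key
    RecordPackage = (Err.Number = 0)
    Err.Clear
    On Error GoTo 0
End Function

Public Sub TestTracking()
    Dim Key As String

    Key = PackageKey("jsmith", "S42", 7)
    Debug.Print RecordPackage(Key)   ' Prints True (first arrival)
    Debug.Print RecordPackage(Key)   ' Prints False (duplicate)
End Sub
```

Collection keys are not case sensitive, so this sketch treats "jsmith" and "JSmith" as the same user.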
Transactions are a hot issue for developers right now. You can find a wealth of information on the topic online. TNL.NET (http://www.tnl.net/newsletter/2001/soapsecurity.asp) includes a newsletter that addresses the issue of SOAP security during transactions. You can find a preliminary demonstration of SOAP transactions at http://www.xbrlsolutions.com/public/demos/crossreference/Soap/PostTransaction.asp. You’ll find at least one white paper on the topic of transactions on the SOAP Web Resource Center (http://www.soap-wrc.com/webservices/default.asp). Perl users will want to read the Quick Start with SOAP (http://www.perl.com/pub/2001/01/soap.html) white paper that also includes information about transaction support. A number of other sites offer small tidbits of information about transactions for specific languages. Of course, you can always read about SOAP transactions in the specification (http://www.w3.org/TR/SOAP/#_Toc478383497).
Obviously, these two techniques are inadequate for today’s data transfer needs. The vendors involved with SOAP are working on a new set of specifications that will allow developers to use transactions in a consistent fashion. After vendors standardize the method for using transactions, you’ll be able to use SOAP for data that is more critical than your average task list. At the time of this writing, transactions are more a gleam in the eye of some vendors than something written down on paper. It will take some time for vendors to devise a good solution to the problem of transactions—one that will honestly work for at least the majority of SOAP users.
Troubleshooting

This chapter has shown you how to work with database applications under SOAP. It’s important to understand that SOAP fixes some problems but presents other challenges. Some developers consider these challenges so severe that they won’t use SOAP for database-oriented tasks. SOAP does increase certain database-related risks as we’ve discussed. However, those risks are small enough that you can use SOAP for some types of database applications.

The following sections will examine some of the questions developers ask about using SOAP for database management and hopefully provide the answers you need to add SOAP to your database management application toolkit. (Always feel free to contact me at [email protected] if you have additional questions.)
How Do I Detect SOAP-Related Database Errors?

You should begin by using the same techniques that desktop applications use. Make sure you create components that filter data and look for obvious errors, such as data that’s out of range or doesn’t follow some other criterion. Transaction processing is the safest way to transfer data from one point to another, but many developers use alternatives because transactions exact a heavy performance toll.

Until SOAP provides transaction support, your best bet for finding problems is to perform detailed data analysis. Of course, you have to have data to analyze, and SOAP isn’t guaranteed to get the data from one point to another. In fact, the Internet is a decidedly unsafe way to transfer data because the data can become lost at any point.

Another method you want to use for database monitoring is to keep a local copy of the data on the user’s machine until the user receives verification that the data ended up at the server in good condition. Most of the applications in this book show how to use a response mechanism, so sending a receipt isn’t a problem.
You do need to consider a subtle problem when using receipts. The data could end up at the server in good shape, but you might not receive the receipt. Perhaps the connection became bad after the data transfer, but before the receipt was transmitted. Maintaining copies of records on the machine is a good idea, but you shouldn’t transmit them again automatically. Instead, mark these records for a manual check after the user gets back to the office to ensure that the database contains only one copy of the data. The best strategy for working with databases is to keep critical data off the Internet. If the data is so important that there’s a 0% tolerance for losing it, then you need to use methods that are safe to transfer it. This means using a local connection and a binary transfer protocol. Using SOAP to transfer critical data will result in an unrecoverable loss at some point.
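The receipt rule described above reduces to a small decision: keep the local copy until a receipt arrives, and when data was sent but no receipt came back, flag the record for a manual check instead of resending it. A minimal sketch follows; the SendStatus enumeration and StatusAfterSend function are hypothetical, and how your client learns whether the request actually left the machine depends on the toolkit and the error it reports.

```vb
' Hypothetical states for a locally stored record.
Public Enum SendStatus
    ssPending = 0        ' Stored locally; nothing was sent.
    ssConfirmed = 1      ' Receipt arrived; the local copy may be removed.
    ssManualCheck = 2    ' Sent without a receipt; never resend automatically.
End Enum

' Decides what to do with the local copy after a send attempt.
Public Function StatusAfterSend(ByVal RequestSent As Boolean, _
                                ByVal ReceiptArrived As Boolean) As SendStatus
    If ReceiptArrived Then
        StatusAfterSend = ssConfirmed
    ElseIf RequestSent Then
        ' The server may or may not hold the data, so a person
        ' has to confirm before anything is transmitted again.
        StatusAfterSend = ssManualCheck
    Else
        StatusAfterSend = ssPending
    End If
End Function

Public Sub TestStatus()
    Debug.Print StatusAfterSend(True, True) = ssConfirmed      ' Prints True
    Debug.Print StatusAfterSend(True, False) = ssManualCheck   ' Prints True
    Debug.Print StatusAfterSend(False, False) = ssPending      ' Prints True
End Sub
```

Only records in the ssPending state are safe to send automatically on the next connection.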
Are the Database Shortcomings of SOAP Permanent?

Vendors are constantly working on improving SOAP. For example, you can read about eventual changes to how the SOAPAction element works at http://lists.w3.org/Archives/Public/xml-dist-app/2001May/0026.html. Remember that this protocol is a standard in the rough; it’s barely ready for use by the general public. Given the number of companies who have already developed products for SOAP and the number of new products that will appear on the scene soon, the future of SOAP looks bright. There’s little doubt that vendors will fix the problems that SOAP has today and you’ll be able to use it for other purposes.

The two big issues that SOAP innovators need to solve to make SOAP a truly useful database platform are transactions and security. SOAP will never transfer data with the 100% assurance that companies require until it supports transactions. Likewise, given the open nature of the Internet, vendors won’t use SOAP for critical data until it includes built-in security. This means some form of encryption and digital certificate support. IBM has already come out with one solution for the security problems at the time of this writing, but it isn’t a standard solution. SOAP will need a standardized security solution to survive.

Some developers would add a third issue to this discussion. The SOAP specification currently promises support for attachments, but no one has implemented this support so far. Several vendors are working on attachment support for their products, and you might see some of those solutions on the market by the time you read this. It’s important that they implement attachments using a standardized methodology, so it might be a while before you see SOAP attachments that truly work across all platforms.
What Are the Top Ten Issues for SOAP Database Developers?

SOAP database developers need to exercise more care than any other SOAP user. SOAP is a great protocol, but it’s hardly a tested product. The following list presents the ten issues that I feel database developers need to consider the most when creating SOAP applications:

1. Always verify that you can convert database data to something with which XML will work.
2. Always verify with which platforms your application will need to work because each platform presents special challenges.
3. Always use small test cases to test the viability of your database project before you commit to the larger project.
4. Always validate the feature set of the SOAP toolkit because there’s a lot of hype that doesn’t appear as usable functionality.
5. Always buffer the data to keep crackers at bay and keep the data safe.
6. Never allow third-party vendors direct access to your backend processing; use secure intermediate components instead.
7. Never erase the local copy of the data before you verify that it has arrived on the server in good condition.
8. Never allow users to remove data from a remote location; always move the data to a holding database instead.
9. Reduce the effects of connection failure by maintaining an alternative connection strategy, such as DCOM or CORBA.
10. Increase your chances of a successful project by using a high level of modularity of project components.
CHAPTER 9

Moving to Web-Based Applications

In this chapter

Uses for Web-Based Applications
Overcoming Problems with Web-Based Applications
Updating a Thick Client Application for Thin Client Use
Creating a Live Data Application
Handling Web-Based Application Errors
Security Issues for Web-Based Applications
Quick Fixes for Memory and Other Resource Problems
Case Study
It wasn’t very long ago that everyone worked on a PC, alone in a world of their own. LANs connected the PCs together, but people were still essentially working alone. Today, people no longer work alone and it’s quite possible that they won’t use a PC to communicate. Collaborations are becoming the rule of the day, and you’ll find that they occur on everything from PCs to personal digital assistants (PDAs). In the near future, you might even use your cell phone to perform certain computer-related tasks. Tying everything together is the Internet and the Web-based application.

This chapter acquaints you with the Web-based application. In the first section of this chapter, “Uses for Web-Based Applications,” we talk about how you can put these new applications to work in your organization. You might not have a requirement for PDA communication today because it’s a new technology, and only a few companies are brave enough to test new technology waters before they’re completely proven. You’ll likely need to create Web-based applications in the future, and this section helps you see their potential.

The second section of this chapter, “Overcoming Problems with Web-Based Applications,” helps you understand how Web-based applications currently fall short of the ideal. What am I using for comparison? The desktop application is the ideal by which users normally judge computer software, so that’s what we’ll use in this section. Although you can overcome some of the problems using clever programming, other problems will steadfastly refuse any form of fix using current technology. This section also talks about those problems that you can repair, and those that you’ll have to live with for now.

The third section, “Updating a Thick Client Application for Thin Client Use,” looks at the problem most of you will have. You currently have a lot of desktop code that works fine on a PC, marginally on a laptop, and not at all on a PDA. Developers commonly refer to desktop applications as thick clients. Web-based applications use the thin client approach because it places less of a burden on the client computer. This section looks at techniques you can use to move a thick client application into the world of the thin client. We begin with a thick client application and move it to a thin client environment. (We won’t look at a PDA example in this section; that example appears in Chapter 10, “Working with PDAs.”)

Many applications today rely on live data, rather than static data. Unfortunately, Web pages are too often associated with static data. It’s hard to keep data live using a Web connection that might not work from one minute to the next. The “Creating a Live Data Application” section of this chapter looks at techniques that you can use to create a live data application, one where the data changes on the client as it changes on the server. It’s not always necessary to provide live data, so we also talk about the situations where live data is most appropriate. In short, the example application in this section is an alternative to working with static data.

The fifth section of this chapter, “Handling Web-Based Application Errors,” shows you how to deal with problem connections and component errors. This section looks at problems that you’ll most likely encounter when working with Web-based applications. For example, losing the connection to the server is a very real possibility in this situation. We examine ways to detect the connection loss and attempt to reestablish it automatically.
For the purposes of this book, the term cracker will always refer to an individual that’s breaking into a system on an unauthorized basis. This includes any form of illegal activity on the system. On the other hand, a hacker will refer to someone who performs low-level system activities, including testing system security. In some cases, you need to employ the services of a good hacker to test the security measures you have in place, or suffer the consequences of a break-in. This book will use the term hacker to refer to someone who performs these legal forms of service.
The sixth section of this chapter, “Security Issues for Web-Based Applications,” deals with a major problem today. Any time you open a server to the Internet, you risk exposure to crackers. However, placing a client on the Internet also poses subtle risks that many companies fail to take seriously. This section looks at the security risks at both ends of the connection and provides you with code to help secure your applications.

The seventh section of this chapter, “Quick Fixes for Memory and Other Resource Problems,” looks at another serious Web-based application problem. Most of the client devices for Web-based applications are memory or resource restricted. Even a laptop has fewer resources, in most cases, than a desktop machine does. This section helps you understand how to circumvent the limitations of Web-based application clients.

The final section of this chapter contains a case study showing how another company is using Web-based applications to get work done today. Most companies are still in the experimental stage with their Web-based applications as I write this. Eventually, these new applications will find their way into the mainstream and make it possible to interact with your company using a larger variety of devices than ever before.
Uses for Web-Based Applications

For some people, moving to Web-based applications is a given because they have a large sales staff on the road or some other remote communication concern. However, for many people, the question of using Web-based applications might not have even come up yet. Some developers find themselves dealing with the problem of the moment, rather than looking at future requirements. The point is that Web-based applications are becoming available, but they aren’t exactly common today. As technologies like SOAP become more prevalent and stable, however, businesses will begin the move to the Internet in a big way.

Deciding which business processes to implement as Web-based applications isn’t easy. Businesses have many concerns when it comes to moving data on the Internet, especially after news reports of cracker activity make it apparent that this is a risky move. Of course, moving anything in a company requires planning, but more importantly, you have to have a goal or a vision for the project. Otherwise, it’s pointless to begin the process. The goal is to find out which processes will work best on the Internet today, and wait for technology to provide the answers needed for other application types. Some applications definitely work better than others do when modified for the Web-based application environment. Eventually you’ll be able to move any application to the Internet,
but today, current technology limits what you can do. Social concerns also come into play. The employees who will use these new applications will have to develop a new set of skills and methods of looking at data. Web-based applications typically rely on a browser interface that’s not equipped to provide a desktop application appearance.

One of the easiest applications to move to the Web-based application environment is the help desk application. You can use help desk applications for more than simple or even complex help files. Some companies use help desk-like applications to provide alternative information to the user, not just help. For this reason, the help desk application is both more complex and more flexible than the help file. The following list provides you with some ideas on how you can use help desk applications to do more than just provide helpful information.

■ Company Policies: The policies used to run your company are normally stored in printed format, making them difficult to access. Using a help desk application to make this information available online is one way to reduce the time required to search for a company policy, making it more likely that employees will follow such policies.
■ Bulletin Board: Companies usually have bulletin boards containing announcements, employee regulations, and the like. In many cases, someone has to go from bulletin board to bulletin board making updates as required. A single help desk application can replace all of the company bulletin boards. Scanned images can replace paper counterparts. In addition to saving time, using an electronic bulletin board allows more freedom in the presentation of information (use of animation is just one example) and closer monitoring of bulletin board content.
■ Forms: Finding a required form can be a difficult and time-consuming process. In fact, this problem results in duplicate forms for many companies because each person is certain that the form doesn’t exist. Using a help desk application to allow employees to find forms can save time, money, and duplications.
■ Company Locator: Someone with a company telephone book has to know something about the company before he can use it. A help desk application that provides the same functionality, on the other hand, is useful from the very first moment. Instead of knowing that James Smith is the personnel director, a user can simply ask for the personnel director and the application will present the required information.
Of course, you can use Web-based applications for more than just help desk support. Web-based applications fulfill a variety of other roles. For example, you can use them for the same utility applications that we discussed in the “An Overview of Remote Access Utilities” section of Chapter 6, “Creating Remote Access Utilities.” You’ll need to provide additional security for the Web-based versions of utility programs because it’s likely that you’ll always use them from a remote location. The Web-based versions of these applications will also look and act slightly differently. This could actually work in your favor because you could use technologies such as eXtensible Hypertext Markup Language (XHTML) to allow the application to run on multiple hardware platforms. Imagine the benefits of administering a network from a cellular telephone.
Some companies are creating full-fledged applications using a Web-based format. For example, some developers feel that new versions of word processors will rely on the Web-based format instead of using proprietary document formats as they do now. (See the “Determining Which Data Entry Vehicle to Use” section of Chapter 7, “Creating Data Entry Forms and Surveys,” for a discussion of StarOffice—one of the potential candidates for this transition.) Using XML as a basis for storing all data means that you no longer have to worry about data becoming inaccessible as the applications that created the data become older. Unfortunately, using XML for all data storage also means investing in larger hard drives.
Many developers will become involved in embedded systems as embedded systems start to appear in more places. You’ll currently find embedded systems in your home, car, and workplace. Some vendors have already connected embedded systems such as alarms, temperature monitoring, and car locators to remote networks, including the Internet. Java and other high-level programming languages are becoming more popular for embedded systems, making it easier for more companies to get involved. Added to this mix is a new SOAP implementation for embedded systems from eSOAP (http://www.embedding.net/eSOAP/). You’ll find an article about this product and its use at http://www.embedding.net/eSOAP/english/index_script.html?src=%22Documents%22. Your next project might not be on the PC; it could be someone’s smart toaster.
Finally, Web-based applications will eventually fill a new class of application. Information exchange has become a central focus for application developers today. Look for Web-based applications to extend this idea in the future. For example, it’s already possible to translate human speech from one language to another. A Web-based application that made that feature available on a PDA or telephone would be extremely helpful for business travelers. Likewise, Web-based applications will eventually make it possible to perform client research onsite. A salesperson could perform a credit check on a customer without ever leaving the area.
Overcoming Problems with Web-Based Applications

Web-based applications present more challenges than the typical application because there are so many variables to consider. For example, companies now demand that Web-based applications work on more than one hardware platform. You might find yourself testing an application on both a desktop machine and a personal digital assistant (PDA). The differences in screen size, capability, and operating system will make such testing interesting to say the least.

Browser compatibility is another issue that you’ll have to learn to circumvent. Web-based application development is hard enough when everyone uses the same browser on the same platform. The fact is that although Microsoft does enjoy a commanding lead in the browser market, it doesn’t own the entire market. In addition, you’ll find that the different versions of Internet Explorer are incompatible with each other. Even if everyone decides to use the same browser, the chances of getting everyone’s version the same are slim.
The usual host of problems is present. In fact, you’ll probably find them amplified in the Web-based application arena. If you have performance problems when working with applications on the desktop, the problems will be far worse when working with a Web-based application. Lost connections and users who fail to follow procedures won’t help matters.

You can overcome most of these obstacles given time and a few resources. Unfortunately, most developers are short on both time and resources. You cannot afford the time to conduct extensive tests, hold users’ hands, and generally clear the environment of hazards. With this in mind, the following list provides some quick problem fixes you should try during your next Web-based application development session:

■ Decide in advance which platforms you’ll support. Make sure you consider the platforms used by partner organizations. You’ll also want to define specifications that match current company plans for upgrades and the installed base of products. It’s important to publish a list of new equipment requirements as part of your specifications so you don’t continue to struggle to support outdated products.
■ Test your application on all of the hardware platforms you expect to use. Performing representative testing, where you check just a subset of the platforms, won’t work. The number of SOAP toolkits on the market right now that work with just one PDA, workstation, or server prevents representative testing from working.
■ Set up a hands-on lab for users. This will allow users to gain experience with the new system before it suddenly appears on their desktop or other hardware. Make sure you include several pieces of hardware in the lab so users can see differences between desktop and PDA use. Stress that none of the data the user creates will remain after the system goes online. This helps prevent some users from monopolizing the machines to get a good start on the new product in advance. A lab also allows you to gather usability information from users before the system goes online so that you can make any required changes early in the design phase.
■ Test all connection types. More than a few companies have created a new application and tested on their LAN. The application works great when used locally, but runs too slow from a dial-up connection. Even if you don’t plan to use dial-up connections for your application, try using your application from a dial-up connection anyway. It’s important to consider all of your options when working with Web-based applications.
■ Try different coding techniques early in the design and implementation phases. SOAP toolkits normally provide several ways of getting any job done. Choosing the most convenient method won’t always ensure good results. You might find that when you use the Microsoft SOAP Toolkit, for example, you need to use the low-level API to gain a performance advantage for your application.
■ Mix communication techniques as needed. SOAP isn’t an all-or-nothing solution to coding problems. Many applications will require continued use of binary protocols, such as DCOM and CORBA, to provide adequate performance. This is especially true for Web-based applications where performance is critical.
■ Diagram your application, components, and even message formats. An optimal setup includes defining all application elements at the outset of the project, and then sticking with the plan unless there’s a good reason to make a change. We’ll explore the BizTalk business-to-business (B2B) solutions in Appendix B, “Microsoft BizTalk and SOAP.” A major part of BizTalk is the utilities it provides for designing an application.
■ Consider adding automation to your application. Sometimes taking the human out of the picture can produce large increases in efficiency and keep users happy as well. The user will want to concentrate on his job, not on your application. Making tasks easy for the user usually garners positive results in application usage.
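The advice to test every connection type can be backed by a rough estimate before any testing begins. The following sketch is written in Python rather than the Visual Basic used in this book's listings, purely for illustration. The message size and link speeds are assumed figures, not measurements, and the calculation ignores latency and HTTP overhead, so real results will be worse.

```python
def transfer_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to move payload_bytes over a link (no latency, no overhead)."""
    return payload_bytes * 8 / link_bits_per_second


# Hypothetical figures for illustration only.
SOAP_MESSAGE_BYTES = 4_000  # assumed size of one request plus its response
LINKS = {
    "10 Mbps LAN": 10_000_000,
    "56 kbps dial-up": 56_000,
}


def estimate(calls_per_screen: int) -> dict:
    """Estimated seconds on the wire to fill one screen, for each link type."""
    total_bytes = SOAP_MESSAGE_BYTES * calls_per_screen
    return {name: transfer_seconds(total_bytes, bps) for name, bps in LINKS.items()}


if __name__ == "__main__":
    for name, seconds in estimate(calls_per_screen=8).items():
        print(f"{name}: {seconds:.3f} s")
```

Under these assumptions, a screen that needs eight calls costs a few hundredths of a second on the LAN but several seconds over dial-up, which is one reason to combine calls where the design allows it.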
Updating a Thick Client Application for Thin Client Use

Web-based application development revolves around the thin client. A thin client is one that relies on the server to perform most of the processing. The thin client displays data and collects information from the user, but doesn’t process the information. Thin clients are akin to the dumb terminals used with mainframes in days gone by. Thin clients do possess more intelligence, but they use the intelligence to service user needs rather than process data.

Contrast this with the thick clients that many developers are used to creating. A thick client processes much information locally and passes only the results to the server. Instead of working with raw data, the server merely acts as a storage and coordination device that tracks the state of the completed information. Thick clients work well on peer-to-peer networks because each machine contributes toward the overall processing goals. However, thick clients also require many local resources, something that small devices such as PDAs and cellular telephones don’t provide. The main reason that developers are moving away from thick clients in some situations is that users are starting to use lower power devices to run applications.

On one end of the question, you have thick clients and on the other, you have thin clients. However, the question isn’t black or white. Many developers see clients as a continuum from thick to thin. Some clients exist in the middle by performing some processing locally and asking the server to do the rest. The point of updating a client from thick to thin is to change the location of the processing. The same processing takes place, but the location has changed.

The following sections will take you through the process of changing a thick client to a thin client. You’ll also change the viewing application from a desktop application to a browser. The same processing will take place after the change, but the way the application displays the data and the location of the data manipulation will change.
Many developers are used to creating components with properties. Using properties allows a developer to set up a component before executing commands. In this way, the developer can perform a setup once and then make calls without including a long list of arguments for each call. Unfortunately, properties won’t work with SOAP because SOAP is stateless. However, you can use the two-component approach explained in this section of the chapter to allow property usage within your SOAP application. A processing component that always resides on the server sets the properties within the server component using standard COM calls. A cookie identifies individual users so the processing component can keep the calls separate. Using this technique allows you to retain some of the benefits of using properties while working with SOAP applications.
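The idea behind the two-component approach can be sketched in a few lines. The example below uses Python rather than this book's Visual Basic, and every name in it (`ServerComponent`, `ProcessingComponent`, `open_session`, and so on) is hypothetical. It only illustrates how a cookie lets a stateless call find the property values that belong to one caller; it is not the component developed in this chapter.

```python
import uuid


class ServerComponent:
    """Stand-in for a stateful server-side component that exposes a property."""

    def __init__(self):
        self.name_type = 0  # property set once, then reused by later calls

    def get_name(self) -> str:
        return f"name-for-type-{self.name_type}"


class ProcessingComponent:
    """Facade for stateless callers: each call carries a cookie that selects
    the caller's own ServerComponent instance."""

    def __init__(self):
        self._sessions = {}

    def open_session(self) -> str:
        # Hand the caller an opaque cookie and create its private component.
        cookie = uuid.uuid4().hex
        self._sessions[cookie] = ServerComponent()
        return cookie

    def set_name_type(self, cookie: str, value: int) -> None:
        self._sessions[cookie].name_type = value

    def get_name(self, cookie: str) -> str:
        return self._sessions[cookie].get_name()

    def close_session(self, cookie: str) -> None:
        # Discard the per-user state when the caller is finished.
        self._sessions.pop(cookie, None)
```

Two callers that open separate sessions can set different property values without interfering with each other, because every later call presents the cookie it received from `open_session()`.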
Generating Proxy Classes the Easy Way

Third-party developers are constantly looking for ways to decrease the complexity of working with SOAP. One technique is to create code generators that solve at least simple problems. You can download the VBWS Proxy Generator, which creates a Visual Basic class based on the content of your WSDL file, at DevXpert (http://www.devxpert.com/resources/). This site contains a wealth of other tools and helpful tutorials as well.

The VBWS Proxy Generator consists of three components: a configuration file that lists all of the data types used in your application, an HTML application that provides a GUI, and an XSLT file that performs the conversion. You start the VBWS Proxy Generator by double-clicking the VBWS.HTA file. Figure 9.1 shows what you’ll see.
Figure 9.1 The VBWS Proxy Generator features an easy-to-use interface.
All you need to do is point the utility to the WSDL file you created for your component using the WSDL generator and supply a directory to hold the resulting Visual Basic class file. Click Generate VB Code and the program will output a class file similar to the one shown here. (This example shows the output from the AddIt component found in Chapter 4, “Using SOAP to Create a Simple Application.”)

    Option Explicit

    Private Const WSDL_URL As String = "http://winserver/soapexamples/SimpleAdd/AddIt.wsdl"

    Public Function DoAdd(ByRef Add1 As Integer, ByRef Add2 As Integer) As Integer
        Dim soap As soapClient
        Set soap = New soapClient
        soap.mssoapinit WSDL_URL, "AddIt", "AddItClassSoapPort"
        DoAdd = soap.DoAdd(Add1, Add2)
    End Function
All you need to do is add this class file to a component, and then use the component to access the functionality of the server-side component. We looked at a similar example in the “Updating a Simple Utility Program” section of Chapter 5, “Migrating an Application from DCOM to SOAP.” The VBWS Proxy Generator automates this process to a certain extent. Note that you still need to add error handling. In addition, the VBWS Proxy Generator might not work for complex data-handling needs. You also need to work with the configuration file to allow it to handle complex data types.
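To show what a proxy generator does in principle, here is a minimal sketch in Python (not Visual Basic, and not the XSLT technique the VBWS Proxy Generator uses). It reads the operation names from a WSDL 1.1 `portType` element and emits a class with one method per operation. The WSDL fragment is a trimmed, hypothetical stand-in for a real file, and the `transport` callable is an assumption that takes the place of the actual SOAP call.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Trimmed, hypothetical WSDL fragment; a real file also contains
# messages, bindings, and a service element.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="AddIt">
  <portType name="AddItClassSoapPort">
    <operation name="DoAdd"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text: str) -> list:
    """Return the operation names declared in the WSDL portType elements."""
    root = ET.fromstring(wsdl_text)
    return [op.get("name")
            for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS)]


def generate_proxy_source(class_name: str, wsdl_text: str) -> str:
    """Emit source code for a proxy class with one method per operation."""
    lines = [
        f"class {class_name}:",
        "    def __init__(self, transport):",
        "        self._transport = transport",
        "",
    ]
    for name in list_operations(wsdl_text):
        lines += [
            f"    def {name}(self, *args):",
            f"        return self._transport({name!r}, args)",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_proxy_source("AddItProxy", SAMPLE_WSDL))
```

As with the generated Visual Basic class, the output is only a starting point: it contains no error handling and knows nothing about complex data types.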
Using Thick and Thin Clients Simultaneously

Some organizations will require more than one client for their applications. Users will want a desktop client (thick) for local use and a Web-based client (thin) for remote use. You can create both clients with little additional work by remembering that the difference between thick and thin is where the processing occurs.

Component technology is wonderful because you can break an application into as many pieces as you need. Creating a processing component, the embodiment of the difference between thick and thin, is one way to handle the problem of multiple client types. You can simply place the processing component on the user’s machine when creating a thick client. Likewise, the server will hold the processing component when working with a thin client.

This still leaves a problem with communication. Remote clients will want an application that communicates reliably over the Internet. To do this, you’ll need to use a protocol such as SOAP for at least the thin client’s connection to the server. Another way to look at this problem is convenience. If you design the processing component to use SOAP as both the input and output protocol, you can move it anywhere and still communicate with the server.

That’s just what we’ll do in this example. The thick client is nothing more than a shell that calls on a local processing component using SOAP. The thin client is nothing more than a shell that calls on the remote processing component using SOAP. Except for the differences in presentation, you’ll find that the applications operate about the same.
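The design choice described here, one processing component that speaks the same message format wherever it runs, can be sketched as follows. The example is in Python rather than this book's Visual Basic, it uses JSON text as a stand-in for SOAP envelopes, and all names (`processing_component`, `ClientShell`, `fake_network`) are hypothetical.

```python
import json


def processing_component(request: str) -> str:
    """Accept a serialized request and return a serialized reply, so the
    component behaves the same no matter which machine hosts it."""
    call = json.loads(request)
    if call["method"] == "add":
        result = sum(call["args"])
    else:
        raise ValueError(f"unknown method {call['method']}")
    return json.dumps({"result": result})


class ClientShell:
    """Presentation-only client; all processing happens behind `send`."""

    def __init__(self, send):
        self._send = send  # callable that takes request text, returns reply text

    def add(self, a, b):
        reply = self._send(json.dumps({"method": "add", "args": [a, b]}))
        return json.loads(reply)["result"]


def fake_network(request: str) -> str:
    """Stand-in for an HTTP round trip to a server hosting the component."""
    wire_bytes = request.encode("utf-8")
    return processing_component(wire_bytes.decode("utf-8"))


# Thick client: the processing component runs in the same process.
thick_client = ClientShell(processing_component)
# Thin client: the same shell, but the component sits behind the "network".
thin_client = ClientShell(fake_network)

if __name__ == "__main__":
    print(thick_client.add(2, 3), thin_client.add(2, 3))
```

Because the shell never sees anything but serialized messages, moving the processing component from the client to the server changes only the `send` callable, not the client code.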
Creating the Server-Side Component

The server-side component for this example isn’t that complicated. All it does is detect the name of the remote server and return it in a string. The server-side component could just as easily access a database, obtain server status information, or perform other tasks as other components in the book have done. You can use this component to obtain several common computer name values available with the GetComputerNameEx() API function, as shown in Table 9.1.
Accessibility to the GetComputerNameEx() API function didn’t ship with Visual Studio 6, so you won’t find it in the copy of Microsoft Developer Network (MSDN) that ships with that product. GetComputerNameEx() function access does come with the latest version of the Platform SDK and should come with Visual Studio 7. The source code in the \Chapter 09\Computer Name Component directory in the source code file available from the Que Web site contains a module with the necessary function declaration. You can find the source code at http://www.quepublishing.com. All other resources appear with the component source code in this section. In short, even if you don’t find the function in your current copy of MSDN, you’ll find everything needed to use it in this component.
Table 9.1 Name Value Return Types for GetComputerNameEx()

Name Type Constant: Description

ComputerNameNetBIOS: Returns the NETBIOS name of the computer. If the computer is part of a cluster, GetComputerNameEx() returns the name of the cluster, rather than the name of the individual computer.

ComputerNameDnsHostname: Returns the DNS host name of the computer. If the computer is part of a cluster, GetComputerNameEx() returns the name of the cluster, rather than the name of the individual computer.

ComputerNameDnsDomain: Returns the DNS domain name assigned to the computer. If the computer is part of a cluster, GetComputerNameEx() returns the name of the cluster, rather than the name of the individual computer.

ComputerNameDnsFullyQualified: Returns the fully qualified DNS name that uniquely identifies the computer within the network as a whole. This normally includes the computer name and the name of the domain to which it belongs. If the computer is part of a cluster, GetComputerNameEx() returns the name of the cluster, rather than the name of the individual computer.

ComputerNamePhysicalNetBIOS: Returns the NETBIOS name of the computer, even if the computer belongs to a cluster.

ComputerNamePhysicalDnsHostname: Returns the DNS host name of the computer, even if the computer belongs to a cluster.

ComputerNamePhysicalDnsDomain: Returns the DNS domain name assigned to the computer, even if the computer belongs to a cluster.

ComputerNamePhysicalDnsFullyQualified: Returns the fully qualified DNS name that uniquely identifies the computer within the network as a whole. It returns the name of the local computer even if the computer belongs to a cluster.
The enumeration also contains a ComputerNameMax value. This value is included with the Visual C++ header file, and Microsoft documents it as part of the COMPUTER_NAME_FORMAT description in the Platform SDK. However, this value is currently unused and will return a blank value.
The GetComputerNameEx() function requires three arguments: one of the name type constants shown in Table 9.1, a buffer used to hold the return value, and the length of the buffer. On return from the function call, the buffer length variable contains the actual length of the name. GetComputerNameEx() returns a Boolean value that’s true if the call is successful. Now that we’ve discussed the basics, let’s look at the component code. Listing 9.1 shows the component-specific code for this example. It doesn’t show the function declaration in the GetCompNameMod.bas file.
Listing 9.1 CompName Component Source Code

    'An enumeration of the possible name types.
    Public Enum COMPUTER_NAME_FORMAT
        ComputerNameNetBIOS
        ComputerNameDnsHostname
        ComputerNameDnsDomain
        ComputerNameDnsFullyQualified
        ComputerNamePhysicalNetBIOS
        ComputerNamePhysicalDnsHostname
        ComputerNamePhysicalDnsDomain
        ComputerNamePhysicalDnsFullyQualified
        ComputerNameMax
    End Enum

    Public Function GetCompName(NameType As COMPUTER_NAME_FORMAT) As String
        'Create and initialize a computer name buffer.
        Dim NameBuffer As String
        NameBuffer = Space(80)

        'Create a buffer length variable.
        Dim BufferLength As Integer
        BufferLength = 80

        'Get the computer name based on input values.
        If GetComputerNameEx(NameType, NameBuffer, BufferLength) Then

            'Check the Buffer Length
            If BufferLength > 0 Then

                'Return the computer name to the caller.
                GetCompName = NameBuffer
            Else

                'Return a no name value.
                GetCompName = "No Name Assigned"
            End If
        Else

            'Return a generic name.
            GetCompName = "No Name Available"
        End If
    End Function

    Public Function GetAllNames() As Variant
        'Create an array to hold the values.
        Dim AllNames() As String
        ReDim AllNames(7, 1)

        'Fill the array with values.
        AllNames(0, 0) = "ComputerNameDnsDomain"
        AllNames(0, 1) = GetCompName(ComputerNameDnsDomain)
        AllNames(1, 0) = "ComputerNameDnsFullyQualified"
        AllNames(1, 1) = GetCompName(ComputerNameDnsFullyQualified)
        AllNames(2, 0) = "ComputerNameDnsHostname"
        AllNames(2, 1) = GetCompName(ComputerNameDnsHostname)
        AllNames(3, 0) = "ComputerNameNetBIOS"
        AllNames(3, 1) = GetCompName(ComputerNameNetBIOS)
        AllNames(4, 0) = "ComputerNamePhysicalDnsDomain"
        AllNames(4, 1) = GetCompName(ComputerNamePhysicalDnsDomain)
        AllNames(5, 0) = "ComputerNamePhysicalDnsFullyQualified"
        AllNames(5, 1) = GetCompName(ComputerNamePhysicalDnsFullyQualified)
        AllNames(6, 0) = "ComputerNamePhysicalDnsHostname"
        AllNames(6, 1) = GetCompName(ComputerNamePhysicalDnsHostname)
        AllNames(7, 0) = "ComputerNamePhysicalNetBIOS"
        AllNames(7, 1) = GetCompName(ComputerNamePhysicalNetBIOS)

        'Return the result to the client.
        GetAllNames = AllNames
    End Function
As you can see, the component contains an enumeration for the computer name type. The enumeration makes it easier to remember which computer name type values to use. The two functions are relatively simple. The GetCompName() function does most of the work. Remember to initialize the buffer before using it with an API call. The GetCompName() function also checks for two potential error conditions and provides alternative values. The GetAllNames() function simply creates an array of values by calling GetCompName() multiple times. This reduces the number of calls that a remote client needs to make to get all of the name information for a server. You should notice two problems with this component—problems that are common with many components on the market. The GetCompName() function requires an enumerated type as input, while the GetAllNames() function uses a variant as output. Neither of these functions will work as written for a SOAP call because most SOAP toolkits can’t handle complex types of this sort. The processing component (see the next section) will handle translating these function calls into something that is easier to use. You should consider the processing component as the component that performs translation as well as provides a means to move around the processing needs of a component.
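One common way around the enumerated-type problem is to send a plain integer over the wire and convert it back on the receiving side. The sketch below shows that conversion in Python rather than this book's Visual Basic. The member values follow the order of the enumeration in Listing 9.1, starting at zero; the names `ComputerNameFormat` and `to_name_format` are hypothetical. Unlike a simple assignment, this version also rejects integers that have no matching member.

```python
from enum import IntEnum


class ComputerNameFormat(IntEnum):
    """Name types in the same order as the enumeration in Listing 9.1."""
    NetBIOS = 0
    DnsHostname = 1
    DnsDomain = 2
    DnsFullyQualified = 3
    PhysicalNetBIOS = 4
    PhysicalDnsHostname = 5
    PhysicalDnsDomain = 6
    PhysicalDnsFullyQualified = 7


def to_name_format(wire_value: int) -> ComputerNameFormat:
    """Convert the integer received from the client into the enumeration,
    rejecting values that do not correspond to any member."""
    try:
        return ComputerNameFormat(wire_value)
    except ValueError:
        raise ValueError(
            f"{wire_value} is not a valid name type (expected 0-7)"
        ) from None


if __name__ == "__main__":
    print(to_name_format(3).name)
```

The client only ever handles integers, so the SOAP message carries a simple type, while the server-side code keeps the readability of the named constants.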
Creating the Processing Component

The processing component for this example has several problems to overcome. First, it needs to overcome the problem of data typing. One way to do that is to use a third-party alternative as we did for the complex data type example in Chapter 8, “Providing Remote Database Access.” (See the “Using Complex Data Types” section of Chapter 8 for details.) Another method is to convert the data type into something compatible with SOAP. Finally, we can perform a complete rewrite of the data into data the client has to parse later.

All of these methods are good. They work within a limited range and you’ll find that they’re successful most of the time. The big problem is that these techniques reduce the readability of your application. You no longer have a verifiable connection to the server component because the means for calling it has changed.

The component in this section not only shows server-side processing techniques, but it also shows two new techniques you can use to overcome the problems with data type conversion. Of course, these techniques are just new tools to add to your existing toolkit; they won’t replace the methods you already use. Listing 9.2 shows the processing component source code.
Listing 9.2 CompNameProc Source Code

    Public Function GetCompName(ByVal NameType As Integer) As String
        'Create an instance of the CompName component.
        Dim MyCompName As CompName.NameValues
        Set MyCompName = New CompName.NameValues

        'Create a conversion variable for the incoming integer data.
        Dim ConvNameType As COMPUTER_NAME_FORMAT
        ConvNameType = NameType

        'Return the computer name data to the client.
        GetCompName = MyCompName.GetCompName(ConvNameType)
    End Function

    Public Function GetAllNames() As String
        'Create an instance of the CompName component.
        Dim MyCompName As CompName.NameValues
        Set MyCompName = New CompName.NameValues

        Dim Result As Variant       'Contains an array of result strings.
        Dim Counter As Integer      'Loop processing variable.
        Dim ParseString As String   'Output string for client.

        'Get an array of computer name values.
        Result = MyCompName.GetAllNames

        'Create a heading for the result string.
        ParseString = "Computer Name" & Space(27) & "Value" & vbCrLf & vbCrLf

        'Process the array. Make sure you account for null-terminated string
        'values as part of the processing.
        For Counter = 0 To 7
            If InStr(Result(Counter, 1), Chr(0)) > 0 Then
                ParseString = ParseString & Result(Counter, 0) & _
                    Space(40 - Len(Result(Counter, 0))) & _
                    Left(Result(Counter, 1), _
                        InStr(Result(Counter, 1), Chr(0)) - 1) & _
                    vbCrLf
            Else
                ParseString = ParseString & Result(Counter, 0) & _
                    Space(40 - Len(Result(Counter, 0))) & _
                    Result(Counter, 1) & vbCrLf
            End If
        Next

        'Return the formatted data string to the client.
        GetAllNames = ParseString
    End Function
The first function, GetCompName(), accepts an integer as input. The integer takes the place of the COMPUTER_NAME_FORMAT enumeration used by the server-side component. However, we still need to convert the integer into a COMPUTER_NAME_FORMAT variable the server-side component will understand. That’s easily accomplished, in this case, by assigning the integer input to a COMPUTER_NAME_FORMAT variable. We can’t send this value directly to the server without generating an error—you must perform the conversion separately. After the code performs the conversion, it calls GetCompName() as normal. The GetAllNames() function begins by creating a connection to the server-side component and obtaining the array containing the return values. We’ll need to parse the data into a string. The parsing process will also format the data. This differs from other components in the book. The purpose of the processing component is to provide the data to the client
application in ready-to-display format. You can assume that creating specific output is required in this case. The idea is to reduce the size of the client and the amount of processing it must perform. At this point, we need to take a close look at the data returned by the server-side component. You’ll encounter an odd problem when working with API calls. Microsoft wrote the API calls in C/C++, which means they use null-terminated strings. Notice the bit of code required in the two functions to overcome this problem. You need to check for the null character and eliminate it before you can pass it along for the client. This isn’t as big of a problem for GetCompName() as it is for GetAllNames(), but both functions could experience problems.
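The null-termination handling and the fixed-width layout produced by GetAllNames() in Listing 9.2 can be restated compactly. The following sketch uses Python rather than Visual Basic and is only an illustration of the same two steps: cut each value at the first null character, then pad each label to a 40-character column. The function names and the sample value "WINSERVER" are hypothetical.

```python
def trim_at_null(buffer: str) -> str:
    """Keep only the characters before the first null character, if any."""
    end = buffer.find("\0")
    return buffer if end == -1 else buffer[:end]


def format_names(pairs, label_width: int = 40) -> str:
    """Build a display-ready report: a heading, a blank line, then one
    'label value' row per pair, each line ending with CR+LF."""
    lines = ["Computer Name".ljust(label_width) + "Value", ""]
    for label, raw_value in pairs:
        lines.append(label.ljust(label_width) + trim_at_null(raw_value))
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # An 80-character buffer as an API call might leave it: the name,
    # a terminating null, and leftover padding.
    raw = "WINSERVER\0" + " " * 70
    print(format_names([("ComputerNameNetBIOS", raw)]))
```

Trimming on the server side means the client never sees the padding or the null character, which keeps the client code as small as this section intends.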
Designing the Thick Client Application

Designing the thick client is going to be relatively easy because the processing component performs nearly all of the work required to prepare the data for display. Listing 9.3 shows the source code for the thick client.
Listing 9.3 Thick Client Source Code

    Private Sub cmdGetAllNames_Click()
        'Create the processing object.
        Dim ComputerName As CompNameProc.NameValuesProc
        Set ComputerName = New CompNameProc.NameValuesProc

        'Get the data.
        txtResult.Text = ComputerName.GetAllNames
    End Sub

    Private Sub cmdGetSingleName_Click()
        'Create the processing object.
        Dim ComputerName As CompNameProc.NameValuesProc
        Set ComputerName = New CompNameProc.NameValuesProc

        'Get the data.
        txtResult.Text = ComputerName.GetCompName(comboName.ListIndex)
    End Sub

    Private Sub cmdQuit_Click()
        'Exit the application.
        End
    End Sub

    Private Sub Form_Load()
        'Set the list index value.
        comboName.ListIndex = 0
    End Sub
As you can see, the client code consists of creating the processing object and displaying the result. This two-step process is all that the client should have to do at this point. The processing component should perform all required data formatting and handling. Figure 9.2 shows the output from this application.
Chapter 9
Moving to Web-Based Applications
Figure 9.2 The thick client application accepts the preprocessed data from the processing component.
In this case, processing still occurs on the client machine. You’ll register the processing component on the client and the processing component will access the server-side component using DCOM. In short, you’ve isolated the processing portion of the application.
Designing a Thin Client Form View
Before moving to a Web page, it’s a good idea to test thin client capabilities using a form view client. This allows the developer to check the viability of the client using code he can debug easily. Remote access applications present enough challenges that you don’t want to try debugging Web page code while checking component availability. This middle step reduces the amount of debugging you’ll need to perform and increases the chances of getting the Web page right the first time. Listing 9.4 shows the code for this part of the example.
Listing 9.4
Form View Thin Client Source Code
Private Sub cmdGetAllNames_Click()
   'Create the SOAP client.
   Dim Client As SoapClient

   'Set up an error handler.
   On Error GoTo ErrorHandler

   'Create the connection.
   Set Client = New SoapClient
   Client.mssoapinit _
      "http://winserver/soapexamples/ComputerName/CompNameProc.WSDL", _
      "CompNameProc", _
      "NameValuesProcSoapPort"

   'Get the data.
   Dim ParseString As String
   ParseString = Client.GetAllNames
   ParseString = Replace(ParseString, Chr(10), vbCrLf)
   txtResult.Text = ParseString
   Exit Sub

'Display a message when an error occurs.
ErrorHandler:
   MsgBox Client.faultstring, vbExclamation
End Sub

Private Sub cmdGetSingleName_Click()
   'Create the SOAP client.
   Dim Client As SoapClient

   'Set up an error handler.
   On Error GoTo ErrorHandler

   'Create the connection.
   Set Client = New SoapClient
   Client.mssoapinit _
      "http://winserver/soapexamples/ComputerName/CompNameProc.WSDL", _
      "CompNameProc", _
      "NameValuesProcSoapPort"

   'Get the data.
   txtResult.Text = Client.GetCompName(comboName.ListIndex)
   Exit Sub

'Display a message when an error occurs.
ErrorHandler:
   MsgBox Client.faultstring, vbExclamation
End Sub

Private Sub cmdQuit_Click()
   'Exit the application.
   End
End Sub

Private Sub Form_Load()
   'Set the list index value.
   comboName.ListIndex = 0
End Sub
As you can see, this code follows the same two-step process as the thick client. In this case, we establish a connection with the server, then use the required function to display the data. It’s important to remember that the XML parser will still strip the carriage returns from your data stream, even though the carriage return is a valid character according to the specification. Using the Replace() function allows you to restore the carriage return/linefeed combination without too many problems.
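If more than one client routine needs this fix, the Replace() call can be wrapped in a small helper. The sketch below is one possible way to do it; the function name RestoreLineBreaks is hypothetical, and the extra first step is an assumption added so that text which already contains carriage return/linefeed pairs isn’t given a second carriage return.

```vb
' Hypothetical helper that restores carriage return/linefeed pairs in
' text returned by the SOAP call.
Public Function RestoreLineBreaks(ByVal Text As String) As String
    ' Collapse any existing CR/LF pairs to a bare linefeed first so the
    ' second step never produces CR + CR + LF.
    Text = Replace(Text, vbCrLf, vbLf)

    ' Expand every linefeed to the CR/LF pair that a text box expects.
    RestoreLineBreaks = Replace(Text, vbLf, vbCrLf)
End Function
```

With this helper in place, the display step in cmdGetAllNames_Click() could be written as txtResult.Text = RestoreLineBreaks(Client.GetAllNames).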
Designing the Thin Client Web Page
Many of your SOAP clients will probably include Web page design at some level. This section shows how to create a basic Web page client that relies on scripts to accomplish its work. The Web page thin client uses a slightly different technique to create a SOAP client, but the process is essentially the same. Listing 9.5 shows the source code for this part of the example.
Listing 9.5
Web Page Thin Client Source Code
CompName VBScript Example

Sub cmdGetSingleName_Click()
   'Create a form reference.
   Dim Form1
   Set Form1 = Document.SampleForm1

   'Create the SOAP client.
   Dim Client
   Set Client = CreateObject("MSSOAP.SoapClient")

   'Get the data and display it.
   Client.mssoapinit "http://winserver/soapexamples/ComputerName/CompNameProc.WSDL", _
      "CompNameProc", _
      "NameValuesProcSoapPort"
   Form1.Results.Value = Client.GetCompName(CInt(Form1.comboName.Value))
End Sub

Sub cmdGetAllNames_Click()
   'Create a form reference.
   Dim Form1
   Set Form1 = Document.SampleForm1

   'Create the SOAP client.
   Dim Client
   Set Client = CreateObject("MSSOAP.SoapClient")

   'Get the data and display it.
   Client.mssoapinit "http://winserver/soapexamples/ComputerName/CompNameProc.WSDL", _
      "CompNameProc", _
      "NameValuesProcSoapPort"
   Form1.Results.Value = Client.GetAllNames
End Sub

Computer Name Component Test
Select a single computer name if needed:
   ComputerNameNetBIOS
   ComputerNameDnsHostname
   ComputerNameDnsDomain
   ComputerNameDnsFullyQualified
   ComputerNamePhysicalNetBIOS
   ComputerNamePhysicalDnsHostname
   ComputerNamePhysicalDnsDomain
   ComputerNamePhysicalDnsFullyQualified
   ComputerNameMax
Result values:
As you can see, the HTML page uses standard tags to create the document. The two buttons have the same names and function as the other clients. Note that each button calls a separate VBScript function that bears a remarkable resemblance to the code we’ve used for other clients.
You have to create the client differently due to limitations in scripting code. After declaring the Client variable, the code uses the CreateObject() function to instantiate a SoapClient object. After you have an object to work with, you’ll use mssoapinit() as normal to initialize the object. Make sure you include the double quotes as part of the call.
Displaying the page means keeping security low because the components aren’t marked safe for scripting. You’ll see a warning message similar to the one shown in Figure 9.3 when you click either of the buttons. Click Yes and the call will proceed.
Figure 9.4 shows the final application output when clicking Get All Names. Notice that the client code didn’t perform formatting, yet the output looks the same as the thick client output shown in Figure 9.2. This is one of the side benefits of using a processing component. The processing component formats the output the same way for all clients, so each client maintains a consistent look.
Figure 9.3 The thin client application will display a warning message because the component isn’t marked safe for scripting.
Figure 9.4 The output of the thin client looks similar to the thick client due to the processing component output.
Creating a Live Data Application
At least a few Web pages now tout dynamic content. In other words, the content of the Web page changes to match real-world events. In most cases, these Web pages rely on advanced programming techniques and client-side components to do their job. In all cases of true dynamic (or live) content, the client and server require several features as listed below:
1. The server component must generate events based on a real-world occurrence. For example, when the price of a stock changes, the server-side component monitoring that price will generate an event.
2. The client application must provide some means for accepting the event. It must listen for the server’s call.
3. The server must provide some means to detect live connections or store data for clients that have gone offline. For example, the server could use a publish/subscribe model or store the requested data in a server-side queue. The client can either opt out of the data flow or request stored messages when it contacts the server again.
4. The server must provide some method to filter unwanted data. Sending every piece of information to all clients will waste bandwidth and could cause memory-limited clients such as PDAs to crash. Even when the client doesn’t crash, the user certainly won’t want to wade through reams of useless information.
SOAP doesn’t provide any of these features. In addition, downloading the components to create a dynamic Web page runs counter to the reason for using SOAP. You’ll find that downloading a component using SOAP is time-consuming, resource intensive, and difficult. Therefore, your chances of creating a Web page with true dynamic content are slim. However, you can copy the techniques used by other Web authors to create content that looks dynamic, but really isn’t. The following list provides some ideas that you could pursue:
■ Use local timers to force a polling sequence by the client. Polling is generally frowned upon as resource intensive and for wasting bandwidth; however, polling at reasonable intervals is acceptable for custom applications. (You can determine a reasonable polling interval by considering the amount of data the client downloads, the processing load on the server, and the amount of available bandwidth.)
■ Use push technology to force a client update. Of course, push technology is only effective if the client allows it; many companies set their browsers to maintain strict security. Limitations in the way push technology works are one of the reasons that vendors have looked for better solutions.
■ Allow client-initiated updates. You could add a button to the Web page that would allow the client to download new content as needed. However, this is hardly automatic, and the client might push the button often out of frustration when waiting for an answer to a query.
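The local timer approach from the first bullet might look like the following sketch for a form-based client. It assumes a form that contains a Timer control named tmrPoll; that control name, the POLL_SECONDS value, and the RefreshData routine are all hypothetical and stand in for whatever your application actually uses.

```vb
' Sketch of timer-driven polling for a Visual Basic form. Assumes the
' form contains a Timer control named tmrPoll (hypothetical name).
Private Const POLL_SECONDS As Long = 300   'Illustrative: poll every 5 minutes.
Private SecondsElapsed As Long

Private Sub Form_Load()
    ' The Timer control's Interval property is limited to 65,535 ms,
    ' so tick once per second and count ticks for longer periods.
    tmrPoll.Interval = 1000
    tmrPoll.Enabled = True
End Sub

Private Sub tmrPoll_Timer()
    SecondsElapsed = SecondsElapsed + 1
    If SecondsElapsed >= POLL_SECONDS Then
        SecondsElapsed = 0
        RefreshData
    End If
End Sub

Private Sub RefreshData()
    ' Placeholder: request fresh data from the server and display it,
    ' for example by repeating the SOAP call shown in Listing 9.4.
End Sub
```

Counting one-second ticks keeps the polling interval adjustable through a single constant, which makes it easier to tune the interval against server load and available bandwidth.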
Will SOAP eventually provide a means for creating dynamic Web pages? Nothing is in the works right now, and few developers have even asked about it. The few who have asked are in businesses that legitimately require dynamic content, such as the stock market. It’s unlikely that you’ll see true dynamic content anytime soon, so these pseudo-dynamic solutions are all that is available.
Handling Web-Based Application Errors
Web-based applications suffer all of the common SOAP problems that we’ve discussed so far in this book. For example, a Web-based application is just as sensitive to changes in the WSDL file as a desktop application is. You don’t get special functionality using a Web-based application either. Nothing about a Web-based application will make it magically handle the complex data type scenarios that we discussed in Chapter 8.
Developers face some additional challenges when working with Web-based applications. For example, unlike your LAN, the Internet has connectivity problems. The following sections discuss some potential application errors that you’ll need to overcome to make your Web-based application work properly.
Handling Connection Loss
Loss of connection for a SOAP application isn’t the same as loss of connection for DCOM or CORBA applications. Remember that SOAP applications are one-time communications; the server doesn’t maintain state or any other information about the client after it responds to a request. That’s why you have to maintain state information outside of the application. It’s also the reason that a loss of connectivity isn’t a major concern. SOAP applications are actually more robust than DCOM or CORBA applications in this regard.
Detecting loss of connection is relatively simple. The client application will receive a No HTTP response received message, like the one shown in Figure 9.5. The client application can detect such a message and automatically send the request again. The application should include a retry count so that it doesn’t try to send the request indefinitely.
Figure 9.5 A No HTTP response received message indicates loss of connection in most cases.
Applications can lose a connection for a number of reasons. In some cases, you might want to go further than simply sending the request again. For example, when you set up an application for use while an employee is on the road, you might want to add code that checks for a dial-up connection. The client could have lost the connection due to a problem with the modem or other hardware. The point is to check beyond the application as needed when using a connection that isn’t of the highest quality.
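A retry count of the kind described in this section could be sketched as shown below. The function name GetAllNamesWithRetry and the error number it raises are hypothetical. The sketch also makes a simplifying assumption: it retries after any error, whereas a real application would probably examine the error first and retry only when the failure looks like a lost connection.

```vb
' Hypothetical helper that calls GetAllNames a limited number of times.
' Client is an initialized SOAP client object (see Listing 9.4).
Public Function GetAllNamesWithRetry(ByVal Client As Object, _
                                     ByVal MaxRetries As Long) As String
    Dim Attempt As Long

    For Attempt = 1 To MaxRetries
        ' Trap errors locally so a failed call doesn't end the routine.
        On Error Resume Next
        Err.Clear
        GetAllNamesWithRetry = Client.GetAllNames

        If Err.Number = 0 Then
            ' The call succeeded, so return the result.
            On Error GoTo 0
            Exit Function
        End If
        On Error GoTo 0
    Next Attempt

    ' Every attempt failed; report the problem to the caller.
    Err.Raise vbObjectError + 1000, "GetAllNamesWithRetry", _
        "No response after " & MaxRetries & " attempts."
End Function
```

The caller decides how persistent to be, for example txtResult.Text = GetAllNamesWithRetry(Client, 3), and handles the final error in its own error handler.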
Scripting Errors
It’s likely that your Web-based application will rely on a browser; it won’t be a self-contained executable like a desktop application. This means you’ll use scripts to perform tasks, such as creating the connection and making a request.
The problem with scripts is that they’re unreliable, even when fully debugged and full of error-trapping code. Think for a moment about any number of Web sites that you’ve visited. I have often fixed scripting errors on these pages by clicking Refresh and waiting for the page to reload. As a developer, this means you need to spend more time debugging your scripts and ensuring that a simple click of the Refresh button does the trick. Part of your employee training should show how to refresh a page when needed.
You also need to exercise care in the selection and upkeep of browsers for your application. Using a single browser for the entire company is a good start. However, you’ll want to go further and ensure everyone has the same version of the browser with the same patches installed.
Many developers underestimate the number of browser-related problems they’ll have with an application. This truly is the weakest link of a Web-based application, and you need to do everything possible to overcome it.
Component Security Problems
You’ll more than likely spend a lot of time handling security problems with a Web-based SOAP application. Depending on the toolkit that you use, the vendor might not have marked the components safe for scripting. This means that you’ll have to wrap the SOAP component in a script-safe component, use the SOAP component at a lower security level, or write your own version of the client portion of the SOAP component. Of the three options, marking the component safe for scripting (assuming it’s safe for scripting) is the easiest and most efficient route to take.
If you look at the examples in the Microsoft SOAP Toolkit, you’ll see another method of overcoming the scripting problem. You create multiple files, some of which reside on the server. The client never sees anything but HTML. The SOAP code executes on the server, which would seem to defeat the purpose of using SOAP in the first place. This method does work and you’ll find that you don’t need to mark components safe for scripting to use it. However, the other methods mentioned in this section perform better. They also make better use of SOAP and allow you to control the flow of information.
Whether you use SOAP or straight HTML for your application, the data you transfer between client and server is at risk unless you encrypt it. The “Security Issues for Web-Based Applications” section of this chapter outlines some ways you can secure your data, but these methods are all “bolted on” and won’t have the same functionality as a native solution. Eventually the vendors who are defining SOAP plan to add security.
Service Is in Use
Depending on how you create your application, you might find that you see a Service is in Use error. The odd thing about this error is that it won’t have an error number, just that single sentence. It normally means that someone crashed the application on the server and the request timed out. Something in your code or your setup allowed the user to enter wrong information or perform the task incorrectly. In some cases, it’s gremlins creeping into your system, but we won’t discuss that possibility. Of course, the more users who try to gain access to the application while the server is in this state, the harder it is to fix.
The easiest and fastest way to fix this problem is to use the IISRESET (or similar reset) command. This command will have the site up and running again in about a minute. The problem with this technique is that it kicks everyone off the server, even those who aren’t using the errant application. This isn’t good for a system that has many users, especially if you have database operations in progress.
You’ll notice that the server-side component in the example is an EXE, not a DLL. This means the server-side component will show up in Windows Task Manager (see Figure 9.6).
You can highlight the errant component entry and click End Process. In some cases, this will fix the problem, but you can’t count on it every time. If the error is with SOAP, the Web server, or the processing component, you’ll likely need to restart the Web server.
Figure 9.6 The Windows Task Manager can cure some, but not all, of your service-related woes.
ActiveX Component Can’t Create Object
If you have users who like to tinker with their system, you’ll eventually see the ActiveX component can’t create object error message shown in Figure 9.7. However, other users will see this problem as well, because it can happen for many reasons. The most common reasons are that the SOAP components aren’t properly registered, they became corrupted in some way, or a problem exists with the input to the components.
Figure 9.7 This message appears in many situations where corruption occurred or the component is missing.
Sometimes all you need to do to fix this problem is click Refresh. An error in the data stream could prevent SOAP from instantiating the component properly. In fact, it’s a good idea to view the source coming from the server to ensure you’re getting the correct input. Registering the SOAP components again can help in a few cases. The applications on your machine play havoc with the registry, so it’s not unheard of for a component to lose its registry settings. The process takes moments, and you’ll know the components are registered.
The next thing you’ll want to try is rebooting the machine. It could be that memory is corrupted and you need to clear it out. I have made more than a few fixes that didn’t show up until after a reboot. Finally, try reinstalling SOAP to ensure none of the components are corrupted. Windows and other operating systems do a better job of protecting components today than they did in times past, but corruption still occurs. Reinstalling the components won’t take too much time, and you’ll be sure the system is clean before you begin other troubleshooting. Sometimes it helps to use the same exact set of inputs from another machine. This check finds server configuration and security problems in some situations. If the user’s setup fails on the second machine, then you need to look at the server. Otherwise, there’s a problem with the first client.
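A client can also trap this error itself and give the user a more helpful message than the default one. The sketch below assumes the MSSOAP.SoapClient ProgID used in Listing 9.5 and relies on the fact that Visual Basic reports “ActiveX component can’t create object” as run-time error 429; the function name CreateSoapClient and the message text are hypothetical.

```vb
' Returns a SOAP client object, or Nothing when the component can't be
' created. The ProgID matches the one used in Listing 9.5.
Public Function CreateSoapClient() As Object
    Const ERR_CANT_CREATE As Long = 429   'ActiveX component can't create object

    On Error Resume Next
    Set CreateSoapClient = CreateObject("MSSOAP.SoapClient")

    If Err.Number = ERR_CANT_CREATE Then
        MsgBox "The SOAP client component could not be created. " & _
               "Try registering or reinstalling the SOAP components.", _
               vbExclamation
    ElseIf Err.Number <> 0 Then
        MsgBox "Unexpected error " & Err.Number & ": " & Err.Description, _
               vbExclamation
    End If
    On Error GoTo 0
End Function
```

The calling code should check the result with Is Nothing before it calls mssoapinit, because the function returns Nothing whenever object creation fails.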
Nothing Happens or Strange Error Message
Configuration and security errors are difficult to track under the best of circumstances. Web-based applications will often test your skills because there are so many points where errors can occur. For example, setting the security level of your browser too high when using unsigned components or components not marked as safe for scripting will produce no results at all. The browser will appear to ignore your input and won’t provide a reason for the problem. If you’re using Internet Explorer, you’ll see an extremely small message in the lower-left corner of the status bar that simply says Error on page. This is the only notification you receive and it disappears the second you remove your cursor from the button.
One of the reasons you need several ways to test your application is to detect problems like this one. If you use the form-based thin client and receive an answer from the server, you know that the local SOAP configuration and everything on the server is fine. The problem, more often than not, is one of browser configuration. Of course, this problem could point to something like unsigned components as well. Browsers are set to detect problems that a desktop application might not consider.
Security Issues for Web-Based Applications
You might feel good about your company’s security measures, but it’s important to ask if that security has a firm foundation. A recent Computer Security Institute (CSI) study conducted with the Federal Bureau of Investigation (FBI) states that computer crime losses are on the rise and security problems are to blame for many of those losses (http://www.gocsi.com/prelea_000321.htm). Current losses topped $371 million, a 41% increase over the previous CSI report. In addition, $151 million of that loss was due to theft of proprietary information, the kind of information that a Web-based application might exchange with a server on the Internet. Actual losses are probably higher than what I’m telling you in this section because only 186 of the 538 respondents were actually willing to share financial information. More losses occurred, but there wasn’t a way to know just how much more.
CSI is an outstanding resource for current security statistics and other security-related information. You can find them at http://www.gocsi.com/.
Another interesting statistic is that 91% of the companies surveyed in the CSI report detected employee abuse of Internet privileges. For example, an employee might give away sensitive information while surfing a site for entertainment purposes or download a virus from one of these entertainment sites. That particular statistic is important because it shows one area where crackers hit your Web-based applications hardest. Employee abuse of the Internet opens a door for crackers to invade your network. The cracker simply attaches a Trojan horse virus to the employee’s machine and waits for the employee to contact the company. In other words, employees can be unwitting aids to cracker activities.
All of these security threats come at a time when companies are making developers more responsible for security. You’ll find that your role in ensuring applications remain safe increases as security threats increase. Network administrators no longer know enough about today’s applications to provide the level of security monitoring they did in the past. The best person to fill in the gaps is the developer.
What this means to you is that you’re going to have to include more security code within your applications. It’s no longer enough to write bug-free code that detects errors. The code you write is now responsible for protecting the security of the data it manipulates. Companies will require your code to detect unauthorized access and use by employees. In short, you now share a security burden that is equal to or perhaps greater than the one borne by the network administrator.
Make sure you consider security and privacy issues at the same time during the development process. Seventy companies are considering a standardized method for exchanging customer data at the time of this writing. Privacy advocates are just as certain they want to decrease the amount of customer information that companies have to exchange. The problem is so severe that government agencies such as the Federal Trade Commission (FTC) have gotten into the act. The FTC recently held a workshop to look at the methods companies use to obtain, store, and exchange customer data. If the government begins to regulate the exchange of customer data and the means used to store that data, you can be sure that security requirements will also increase. Building security into the portions of your application that affect privacy today will ensure you don’t have to rush to make government-mandated changes tomorrow. Make certain that you use flexible security and privacy code—the requirements for both security and privacy are certain to change as the Web environment changes.
Remember from previous chapters that SOAP provides nothing in the way of security. After reading the statistics in the CSI report, you might wonder how you’re going to create a secure application that will protect company data with no tools for the job. The answer is that you’ll need to secure the transport protocol and place SOAP within an envelope provided by that protocol. Secure-Hypertext Transfer Protocol (S-HTTP) provides one answer to the question; Secure Sockets Layer (SSL) provides another. By the time you read this, standards groups will provide yet other ways to secure your application.
The following sections will explore various methods you can use to secure data transmitted by Web-based applications. We won’t look at every possible technology. I chose the technologies that I felt you would need most often and that offer the broadest base of support. Even if you don’t use the example code in this section directly or choose a different security technology for your company, the examples show various techniques that you can adapt to just about any standardized security scheme.
An Overview of the Potential Security Solutions
It’s important to know the potential sources of security solutions for the problems in your organization. The Internet as a whole has to grapple with this problem. Vendors are continually devising new technologies, but crackers seem to find holes in the new strategies just as quickly. In short, security is a tug of war where the fastest party wins.
Security plays an important role in just about every business. However, security could take on an increasingly important role as governments dictate privacy laws and begin issuing security requirements for specific industries. For example, they recently issued further guidelines for the financial industry. In this case, the guideline probably equates to law because bank examiners view guidelines as suggestions that financial institutions must follow. As you plan your security strategy, look around at other industries. Even if the government isn’t telling you the minimum acceptable security limits for your business today, you might find those limitations in place tomorrow.
The following sections will look at some potential security solutions for your SOAP application. All of these methods are standards that Internet vendors currently use for Web sites and other users. Of course, the best solutions will arrive when SOAP finally has security of its own.
S/MIME
Securing e-mail transfers, especially if you use them with a SOAP application, is important. E-mail is exposed not only during the transfer, but also while it sits on the server and then in the user’s Inbox. RSA Data Security, Inc. originally developed Secure/Multipurpose Internet Mail Extensions (S/MIME) as a method for developers to create message transfer agents (MTAs) that use compatible encryption technology. A consortium of vendors, including Microsoft, Banyan, VeriSign, ConnectSoft, QUALCOMM, Frontier Technologies, Network Computing Devices, FTP Software, Wollongong, SecureWare, and Lotus, is promoting this standard. Essentially, this means that if someone sends you a message using a Lotus product, you can read it with your Banyan product. S/MIME is based on the popular Internet MIME standard (RFC 1521).
You can find out about standard MIME at http://www.oac.uci.edu/indiv/ehood/MIME/. In addition, you can find an entire list of S/MIME-specific resources at http://www.rsasecurity.com/standards/smime/.
S/MIME represents one of the better ways to transfer data if you plan on using a one-way communication scenario. For example, using an S/MIME transfer makes sense if you plan to create a SOAP application for surveys. Unfortunately, until SOAP toolkit vendors begin to support MIME and S/MIME as part of an HTTP transfer, you won’t be able to use this technology for two-way communication. It’s possible to do so, and as I write this, vendors are looking at ways to implement SOAP with attachments, which is essentially a form of MIME.
Solutions That Require Two-Way Communication
One of the problems that I keep mentioning about working with the Internet is the lack of a secure connection. When it comes to security, however, the lack of a connection could turn into a significant problem. Many security protocols in use today require a two-way connection that remains intact long enough to exchange validation information and then the data, which is a minimum of four trips from client to server and back in most cases. Security standards such as SSH, SSL, TLS, IPSec, Kerberos, Kracken, SRP, and PAK all share the same need for two-way communication.
Using these standards to secure your online data ties you to transport protocols, such as HTTP, which support two-way communication. In addition, you’ll find that the client and server must exist at the same time, which is one of the problems people are trying to solve when using distributed applications. Developers want to work with disconnected applications that can rely on protocols such as SMTP in addition to HTTP. Although these two-way communication techniques work fine for a LAN, you’ll find them cumbersome for a Web-based application. This is one of the reasons that SOAP should have a built-in security mechanism that is more in line with its purpose.
Payload Protection
Sometimes the best security solution is the one that requires the least effort to implement. Payload protection falls into the easy-to-implement category because you don’t need a lot of special software to do it. Using the MIME message technique we discussed in the “Understanding SOAP Attachments” section of Chapter 2 allows you to send a SOAP message and an attachment. The attachment could contain the sensitive data.
Of course, you need to provide some type of protection for the payload. It’s important to protect the sensitive data in such a way that only those with the proper credentials can access it. The problem with most security solutions is that the user has to go through a lot of bother to use them. That’s why you still don’t see much encryption for e-mail messages. The user has to go out, get a certificate, and then go through the trouble of encrypting the message.
Alternative encryption products could make payload protection both safe and easy. For example, Masker (http://www.masker.de/) allows you to encrypt data in such a way that it’s still viewable using a double-click. The only difference from a user perspective is that the program now asks for a password before Masker will grant entry. Your SOAP application could use a MIME message to transfer the Masker-encrypted payload from one location to another. Masker relies on the same RC4 encryption algorithm used by encryption technologies such as SSL, which is reasonably safe, but not completely secure. In other words, you could trust it to transfer sensitive, but not critical data. We’ll look at the techniques for using Masker in Appendix C, “Third-Party Tool Reference.”
The industry term for this type of encryption is a carrier file. An application creates the carrier file around the data file. The carrier file normally provides the required encryption. (Some types of carrier file products merely provide data hiding, not data encryption.) The application runs in the background on the client machine and provides automatic decryption services as needed. In short, carrier files provide invisible data protection.
Carrier files aren’t new. Products such as the wbStego Steganography Tool (http://wbstego.cjb.net/) have successfully hidden data files within bitmaps, text files, HTML files, and even PDF files for quite some time now. Encryption is actually an optional part of this program because it focuses on data hiding, rather than data encryption.
Using SSL
One of the currently viable solutions for the security problem is SSL. Many developers find that they need to test SSL with standard Web pages before they use it with SOAP. Getting a server set up correctly seems to be a problem (as evidenced by the many SSL-related messages on the various newsgroups).
To use SSL, you’ll need certificates for both client and server either issued by a known central authority such as VeriSign or generated using a certificate server such as the one found with Windows 2000. After you have the certificates installed on both machines, make sure you attach the certificates to the root authority of the machine. Otherwise, the certificates will exist, but you won’t be able to use them properly.
Always make sure to test your code using a clear connection before you add security. Otherwise, simple communication problems might appear as security problems. After you debug the application, you can modify the SOAP initialization call as shown here:

Client.mssoapinit _
   "https://winserver/soapexamples/ComputerName/CompNameProc.WSDL", _
   "CompNameProc", _
   "NameValuesProcSoapPort"
Notice that the only change is to convert “http” to “https” in the second line. The use of SSL should be transparent to the end user and nearly so to you as the developer. If you find that SSL presents a lot of problems, make sure you have the Web server configured properly and that you install the certificates correctly. These two issues seem to create the vast majority of problems for developers. As previously mentioned, keep running tests with standard Web pages until you are certain that you have SSL running on the Web server.
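The "clear connection first, then SSL" workflow can be sketched as a small helper that derives the secure WSDL address from the one you have already debugged. This is only an illustration in Python rather than the Visual Basic used by the example; the helper name to_https is hypothetical, and the URL is the one from the listing above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(wsdl_url: str) -> str:
    """Return the same WSDL URL with its scheme switched to https.

    Hypothetical helper: it only rewrites the scheme, mirroring the
    one-word change made to the mssoapinit call in the text.
    """
    parts = urlsplit(wsdl_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return urlunsplit(parts._replace(scheme="https"))


# Debug against the clear URL first, then switch to the secure one.
CLEAR_URL = "http://winserver/soapexamples/ComputerName/CompNameProc.WSDL"
SECURE_URL = to_https(CLEAR_URL)
```

Keeping both addresses derived from one value makes it easy to fall back to the clear connection when you need to separate communication problems from security problems.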
Using IBM Web Services Toolkit

The Web Services Toolkit (http://www.alphaworks.ibm.com/tech/webservicestoolkit) is one of the few that currently support both encryption and digital signatures. Like the Microsoft SOAP Toolkit, the Web Services Toolkit supports many of the latest SOAP innovations, including WSDL files. The Web Services Toolkit goes further, however, and
supports some vertical market standards such as Trading Partner Agreements Markup Language (tpaML). You’ll find versions of the Web Services Toolkit for Windows NT (with SP6 installed), Windows 2000, and Linux. Given that the Web Services Toolkit works on both Linux and Windows, you might find it the optimal solution for applications that have high security needs. Unlike other solutions, this one has security built in, not bolted on as an afterthought. Integrated security tends to provide better protection than protocols such as SSL.

Adding encryption to the Web Services Toolkit does modify the SOAP header. The messages sent by the Web Services Toolkit have to indicate that encryption is in place so that the receiver knows to decrypt the message. The extension that the IBM solution uses might not be compatible with other solutions when they appear on the market. It is my hope that the vendors will get together and create a standardized method for adding encryption.

A couple of special requirements exist when using the Web Services Toolkit. Make sure you look at the requirements list for this product. At the time of this writing, you need a full installation of Java to work with the product. In fact, the Web Services Toolkit includes Universal Description, Discovery, and Integration for Java (UDDI4J) to make it easier to advertise services. The examples that I used also required the addition of a plug-in for the browser, which you might not always need.

Note that this product runs on WebSphere Application Server or Apache Tomcat. If you’re already running one of these two servers, adding support for the Web Services Toolkit won’t present problems. On the other hand, if you’re running IIS or another Web server product, using the Web Services Toolkit could become problematic or even impossible.
Fortunately, the Web Services Toolkit does include an embedded version of the WebSphere Application Server so that you can try it before you make a commitment to this product. Problems aside, if you absolutely have to have the very best security right now, the Web Services Toolkit is probably your best option. Make sure you purchase a compatible Web server before you create production applications because the embedded server is only useful for test scenarios.
Quick Fixes for Memory and Other Resource Problems

Web-based applications are one of the newest application types for any developer. The Internet has radically changed how developers create applications and how users interact with those applications. Developers no longer tie the Web-based application to the desktop machine, although you can certainly use it in that role. A developer needs to consider more than one client type now: everything from a desktop machine, to a laptop, to a notebook, and even personal digital assistants (PDAs). It’s little wonder, then, that there’s actually more room for error in the new world of the Internet, not less.
Unlike the desktop or LAN application, the world of the Internet is largely unexplored now. Yes, you can find a lot of code, but most of it won’t run with anything else because standards are few and reliable code is in short supply. Every developer who creates an application for this environment is a pioneer, and like all pioneers, he is subject to the vagaries of exploration. You might find that the perfect application you created works just fine on the company laptop machines, begins to falter on a notebook, and finally fails when it comes to PDA support. The use of multiple clients means that testing for a Web-based application is more extreme than anything created for the desktop.

Internet applications also have a worldwide appeal. You can no longer count on a particular set of biases when it comes to the user interface. Not only do you have language to consider, but also every nationality has different ways of working with computer applications. In some cases, you might have to write the application for the lowest common denominator and allow user customization to account for differences in tastes.

All of these issues aside, the Web-based application is inherently different from anything you might have created for the desktop in other ways. For example, you need to consider more interfaces and data conversions. The data that you store in SQL Server no longer undergoes a single transition from the DBMS to the client application. Data undergoes several levels of change as components and services modify it to fit within the HTML specification for browsers and browser-like applications. Of course, this also means tailoring data types for SOAP.

The following sections can’t help you locate and kill every potential problem that you’ll run into with Web-based applications. Consider these sections as a starting point, a list of the most common problems that you’ll run into as you work with the new Internet applications your company will demand.
We’ll consider the most likely issues that every developer will run into.
Component Interactions

Component interactions can take on a new meaning with Web-based applications. For example, the applications in this chapter rely on multiple components that all have to communicate correctly to send the data to the client application. These components appear on more than one machine and might not appear in the same place for all implementations. The processing component appears on the client machine when used with a thick client and the server when used with a thin client. One form of the thin client even relies on script code, with all of the problems that scripts can entail.

The loss of connection can create situations where the client and server can’t communicate well. Because of the way HTTP handles keep-alives, the client could lose the context it establishes with the server before the client completes whatever task needs to be accomplished. The point is that you must treat every query as a new call, even if the Web server keeps the original connection alive. (This is unlikely with SOAP, but could happen in rare instances.)
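One way to treat every query as a new call is to make each request carry everything the server needs, so nothing depends on context left over from an earlier exchange. The following Python sketch builds a self-contained SOAP 1.1 envelope; the method name, namespace, and parameter names passed to build_envelope are placeholders chosen for illustration, not values from the book's examples.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(method: str, namespace: str, params: dict) -> str:
    """Build a request that is complete on its own.

    Every parameter travels with the call, so the server never has to
    remember anything from a previous request on the same connection.
    """
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: both arguments are resent on every request.
request_xml = build_envelope(
    "GetCompName", "urn:example-names", {"NameType": 1, "User": "jo"}
)
```

Because the envelope is rebuilt from scratch each time, a dropped or recycled connection costs only a retry, not lost state.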
Scripting also presents other problems. Creating an instance of an object using a script is seldom the same as creating an instance of the same object with a desktop application because of the scripting environment. Consequently, you might find that components won’t work at all with your script or might behave differently than you expect them to. You can get around this problem by testing a component thoroughly during development. Make sure you include scripting tests as part of your test suite. It’s also important to make sure the component behaves the same no matter what type of desktop application or script is calling it. Developers design many components with callbacks in mind. The client makes an initial query, and then the component uses a callback to obtain additional information from the client. The callback actually issues a request to a special section of the client code to obtain the additional information or data processing. Because Web-based applications are unreliable when it comes to communication, it’s important to design the component to rely on the client’s initial query without callbacks.
Benefits of the Script Debugger

Scripts are a requirement when working with Web-based applications. In fact, you’ll probably need a combination of client-side and server-side scripts to make the application fully functional. Client-side scripts normally don’t present much of a problem when it comes to error handling. Either the script works or it doesn’t. Although the error code the user gets might seem somewhat cryptic, it’s usually easy to locate the problem and fix it because client-side scripts are small and usually limited in scope.

Server-side scripts present other kinds of problems. Most scripting languages don’t provide anything in the way of error handling. In addition, passing the error information along to the client might be difficult without a lot of additional coding. Finally, unlike standard desktop applications or components, scripting languages provide little access to the event log, denying the developer even this potential method of recording errors.

In summary, the developer has to be exceptionally careful when creating the server-side script. Providing range and other non-error producing checks will help reduce problems. If a range or other easily detectable problem is encountered, the script can always provide client feedback in the form of a special Web page.

Fortunately, Windows 2000 provides rudimentary script debugging. You need to install this feature separately using the Windows Component Wizard accessed using the Add/Remove Programs applet in the Control Panel. Figure 9.8 shows the required entry. After you add this feature to your development platform, you’ll always be able to debug scripts. You need to install this feature on both the client and the server; the Script Debugger doesn’t provide remote debugging capability. If you have full Visual InterDev support installed on the client and the correct debugging features installed on the server, you can perform remote debugging.
When a script error occurs on either the client or server, you’ll see a dialog box asking if you want to start debugging the application. A yes answer will start Visual InterDev and allow you to perform full debugging of the application. This support is supposed to be even better with Visual Studio .NET.
Figure 9.8 Windows provides a script debugger you can use to locate problems with your script code.
Developers who rely on IIS will want to configure it to provide debugging support. Right-click the Web site in question and select Properties. You’ll see a Web Site Properties dialog box. Select the Home Directory tab and click Configuration. You’ll see an Application Configuration dialog box. Select the App Debugging tab and you’ll see a dialog box like the one shown in Figure 9.9. This is where you’ll choose the type of remote debugging that IIS will support. Other Web servers are likely to support debugging in other ways, so check your vendor documentation for details.

Figure 9.9 Verify that IIS is configured to provide full debugging support.
IIS also allows you to set debugging at the directory level. This is actually a safer option for Web servers exposed to the Internet. You can set most directories to disallow debugging and add it to those few directories that you’re using for application development. The location of the Configuration button differs by object type. For example, you’ll find it on the Virtual Directory tab of a virtual directory.
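The range checks recommended earlier in this section for server-side scripts can be sketched as follows. This is a minimal Python illustration, not ASP code; the function names check_range and error_page and the page layout are assumptions made for the example.

```python
import html


def check_range(field: str, raw_value, low: int, high: int):
    """Validate input without raising an error.

    Returns (value, None) when the input is acceptable, or
    (None, message) when the script should show a feedback page.
    """
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        return None, f"{field} must be a whole number."
    if not low <= value <= high:
        return None, f"{field} must be between {low} and {high}."
    return value, None


def error_page(message: str) -> str:
    """Render the special Web page that tells the user what went wrong."""
    return (
        "<html><body><h1>Input Problem</h1>"
        f"<p>{html.escape(message)}</p></body></html>"
    )
```

A script would call check_range before touching the component and return error_page(message) whenever the second element of the result is not None, so the problem never reaches code that has no error handling.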
Human Language Support

One of the biggest problems you’ll run into when working with a Web-based application is the issue of human language support. The Internet isn’t a closed environment. If you plan to make a Web-based application open to the public through the Internet, you’ll need to provide support for more than one language.

In many cases, the problem isn’t one of translating the content of your Web site. There are many well-understood methods for translating text from one language to another. The main problem is one of application usage. For example, the captions on your application will need to change to support the other languages. You could place the required information in a database that the application could download as needed. The most recently used languages could be stored in a local database or even the registry.

You need to consider broader issues, however. For example, the layout of your application must be flexible enough to accommodate the way that people normally work or users will complain that the application is difficult to use. This means allowing users to configure the display. Although this isn’t a problem with most desktop applications, it’s a little more difficult with a Web-based application because you don’t have control over the position of the various components. In most cases, the best that you can do is store the user preferences in a cookie, then use the contents of the cookie to configure the display as data downloads from the server.
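The idea of keeping captions in a downloadable table can be sketched with a simple lookup that falls back to a default language. The table contents, the caption keys, and the caption function below are all hypothetical; in a real application the table would come from the database mentioned above.

```python
# Hypothetical caption table, keyed by language and then by caption key.
CAPTIONS = {
    "en": {"submit": "Submit", "cancel": "Cancel"},
    "fr": {"submit": "Envoyer"},
}


def caption(key: str, language: str, default_language: str = "en") -> str:
    """Look up a caption, falling back to the default language.

    If neither language has the caption, the key itself is returned so
    the control is never left blank.
    """
    for lang in (language, default_language):
        text = CAPTIONS.get(lang, {}).get(key)
        if text is not None:
            return text
    return key
```

The language argument would normally come from the stored user preference, such as the cookie described in the preceding paragraph.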
ASP and Component Communication

One of the ultimate problems with scripting is that you lose the direct connection between the client and the server. The SOAP application might generate an error, but unless you design the script that formats the Web page to pass that error along, the user might never see it. Because a Web-based application normally has a direct connection (you wouldn’t create a disconnected application in many cases), it’s important to pass any component errors to the client.

Unfortunately, simply passing the error along isn’t enough in most cases. The problem is that the user won’t know where the problem occurred unless you provide additional information. In most cases, the user will assume that an error has occurred locally, unless you make it clear that the problem occurred within the component. Fortunately, SOAP helps in this regard by making at least its error messages server oriented. Unfortunately, the SOAP messages are still ambiguous at times.
Of course, complex applications might have several layers of component calls. Therefore, it might be tempting to add code so the user sees the precise location of the problem. Rather than build overly complex components that do little for the user, it’s better to simply report the error as a component error, then tell the user to contact the network administrator about the problem. An event log entry will allow the network administrator to determine the precise cause of the problem and take steps to fix it.
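The split between what the user sees and what the administrator sees can be sketched like this. It is a Python illustration only: a standard logger stands in for the Windows event log, and the function name and incident identifier are assumptions added for the example.

```python
import logging
import uuid

# Stand-in for the event log; a real component would write to the
# system's own logging facility.
log = logging.getLogger("component_errors")


def report_component_error(component: str, exc: Exception) -> str:
    """Log the precise cause and return a short message for the user."""
    incident = uuid.uuid4().hex[:8]
    # Full detail goes to the log for the network administrator.
    log.error("incident %s in %s: %r", incident, component, exc)
    # The user only learns that the failure happened on the server side.
    return (
        f"A server component reported an error (incident {incident}). "
        "Please contact your network administrator."
    )
```

The incident identifier gives the administrator something to search for in the log without exposing the internal call chain to the user.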
Component Doesn’t Support Locale Error

Increasingly, desktop applications are required to support multiple languages, but it’s still possible to write a custom application and not worry about the locale. You can’t say the same of a Web-based application. By its very nature, a Web-based application requires multiple language support. Of course, this presents a problem for the developer because he might not know which locales an off-the-shelf component will support, if it supports more than one language at all.

Visual Basic applications, including components, will generate a 447 error code (“Object doesn’t support current locale setting”) when a component called by the application doesn’t support the current locale setting. When you detect this error in your Web-based application, you’ll normally have three courses of action. First, you can ask the user if some other language will work, in which case you’ll have to try the call again with the new locale. Second, you can try to get around the problem by attempting to access the component with another dialect of the same language. For example, several locales support French. The resulting text might not be perfect, but it should be readable. Third, you can register a failure with the user and present any data you were able to retrieve before the error occurred.

The 447 error code doesn’t always occur when a component lacks support for a particular language. The LCID table has had several versions over the years; therefore, the one used by your component might be out of date. Getting a newer version of the component will usually help. If that isn’t possible, you might need to find an alternative LCID that the component does support. Again, this isn’t a perfect solution, but it might be the only choice in some cases.
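The second course of action, retrying with another dialect of the same language, can be sketched as a simple fallback loop. This is a Python illustration of the logic only; LocaleNotSupported is a hypothetical stand-in for the Visual Basic locale error, and the caller supplies the list of alternative LCIDs.

```python
class LocaleNotSupported(Exception):
    """Hypothetical stand-in for the component's locale error."""


def call_with_locale_fallback(component_call, preferred: int, alternatives):
    """Try the preferred LCID, then each alternative in order.

    Returns (lcid_used, result) on success or (None, None) when no
    locale works, so the caller can register a failure with the user.
    """
    for lcid in (preferred, *alternatives):
        try:
            return lcid, component_call(lcid)
        except LocaleNotSupported:
            continue
    return None, None


# LCID 3084 is French (Canada); 1036 is French (France).
FRENCH_FALLBACKS = {3084: [1036]}
```

A caller might use call_with_locale_fallback(component, 3084, FRENCH_FALLBACKS[3084]) and tell the user which dialect was substituted when the returned LCID differs from the one requested.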
Case Study

Web-based application development may seem like a pipe dream that will never come to fruition for some people. However, some companies have already seen results from using SOAP to re-create their desktop applications as Web-based applications. In fact, the move from the desktop has re-created some companies to the point that the corporate structure is no longer the same. Consider the work done by Etelos.com. I had the pleasure of talking with Ahmad Baitalmal about their new application, Referral Survey.
An Overview of the Problem

Etelos.com realized early during a market downturn that they were in the same predicament as many other companies. They had a great Web site with some loyal customers, but the Web site wouldn’t generate enough income to keep the company afloat. Etelos.com provides
a referral service named Referral Survey that salespeople of other companies use to become more productive and eventually more profitable as well.

Referral Survey is a special type of referral service. Some organizations have a large customer base that is hard to track. In many cases, they don’t take full advantage of their customer base as a resource for new business. Prompting each customer to provide comments and a referral generates a lot of leads. These are qualified leads because customers who know what services the company provides generate them. Referral Survey allows companies to automate the process of generating the customer requests and collecting the comments and sales leads.

The upper management of the client companies saw the value of Referral Survey. Salespeople with better tools who could work more efficiently are generally more profitable for the company as well. However, the salespeople were resistant to using the new tool. The Etelos.com staff tried to train them and make them more aware of the potential of Referral Survey, but the results were disappointing. The client companies didn’t realize the benefits they had originally sought.

The problem is one of, “we don’t do things that way here.” The sales staff didn’t want to interact with previous customers directly; they wanted to maintain the old way of working with them. It’s a recurring problem in many companies. Education and making people aware of new tools that can help them only goes so far. Of course, it’s easy to blame the people in this situation, but there are other factors to consider. For example, the new tools required a conscious effort to use and weren’t as transparent as they could be. The sales force had to work too hard to make the new tools work.
The Solution

The Etelos.com programming staff looked at the whole referral system provided by Referral Survey. They looked for ways to automate every important feature of their product. Once they found some areas they could automate, Etelos.com went back to the clients with their solution. The new system overcame many of the negatives that the client salespeople complained about, yet kept all of the positive features the tool could provide.

In its current state, Etelos.com looks at the sales process at an organization and identifies where an “automated” survey will enhance customer relations. The client’s system calls a method on Etelos.com’s system to send a survey, collect comments, and prompt for a referral. There are no salespeople to train on using this tool, which reduces costs because only a single administrator has to know about the Etelos.com system and configure it. In short, the new system completely automates a task so that all the client’s sales staff needs to do is make calls to potential customers.
Now that everyone was on board, the Etelos.com programming staff faced an even more difficult task. All of the automation they added to the application meant creating better integration with the client systems. Unfortunately, every client had a different system and the Etelos.com programming staff faced the prospect of creating unique interfaces for each client.
The SOAP Connection

The Etelos.com staff decided to use SOAP to create a Web service interface for their application. Since their system already relied on Microsoft technology, making the transition using the Microsoft SOAP Toolkit was relatively painless. Within a month, Etelos.com created a solution to expose their existing services. They had planned to spend three months on the project using DCOM, so the time they saved was a nice plus of using SOAP. Not only did they save time, but the programming staff also feels they have far fewer headaches implementing the SOAP solution.

The changes the Etelos.com staff made delighted clients that already knew about XML and SOAP. The changes made it easier for the client to interact with the server. In addition, changes to the client’s hardware wouldn’t affect their ability to work with Etelos.com because SOAP effectively isolated them from the changes.

Even the configuration interface uses SOAP. Using ActiveX to build the Administration UI and SOAP to communicate with the Etelos.com service allows the client to configure their system directly. This makes life easier for the client’s administrator, who no longer needs to call Etelos.com to make changes, and reduces support costs for Etelos.com, which no longer has to staff as many people to make configuration changes.
A Negative to Consider

One of the problems that the Etelos.com staff has with SOAP is that it doesn’t generate events. This means that a client must poll for new referrals, rather than get them as soon as they become available. Polling wastes valuable network bandwidth, so using events would have made the system more efficient as well. Etelos.com considers this a minor problem because of the other benefits they received.
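The polling pattern described here can be sketched in a few lines. Nothing below reflects how Etelos.com actually implemented it; fetch_since stands in for whatever SOAP call returns referrals newer than a given identifier, and the interval values are arbitrary assumptions. The second function shows one common way to limit wasted bandwidth by polling less often while nothing new is arriving.

```python
def poll_once(fetch_since, last_seen_id: int):
    """Ask the server for referrals newer than the last one seen.

    Returns the new items and the updated marker to use next time.
    """
    new_items = fetch_since(last_seen_id)
    if new_items:
        last_seen_id = max(item["id"] for item in new_items)
    return new_items, last_seen_id


def next_interval(current: float, found_new: bool,
                  minimum: float = 5.0, maximum: float = 300.0) -> float:
    """Back off while idle; return to the minimum when data arrives."""
    if found_new:
        return minimum
    return min(current * 2, maximum)
```

A client loop would call poll_once, process any items, then sleep for next_interval(...) seconds before asking again.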
CHAPTER 10

Working with PDAs

In this chapter
Special Needs for PDAs
Getting SOAP for Your PDA
Updating the Complex Type Example
Updating the Computer Name Example
Addressing PDA Display Issues
Beyond PDAs to Telephones
Understanding PDA Security Issues
Troubleshooting
The world is becoming more mobile all the time. It’s nothing to see someone talking over a cell phone today, yet just a few years ago, it was a novelty. When I first saw a Personal Digital Assistant (PDA) in 1998, I thought it might be a passing fad or a device of limited use. Today that vision has changed significantly. Developers create applications for PDAs now that many people would have thought impossible even a year ago.

SOAP is an excellent protocol for PDAs because the people who use them spend most of their time on the road. PDAs always perform remote communications with the company network because vendors don’t design them to communicate in any other way. In short, if you own a PDA, you need good remote communications of the sort that SOAP can provide.

Obviously, PDAs have special programming requirements. They don’t have a hard drive in the normal sense of the word, lack a high-speed processor, and even memory is at a premium. The first section of this chapter, “Special Needs for PDAs,” examines the special programming required for PDAs. You’ll find that you need to jump over some high hurdles to make some types of applications work on a PDA, even with the help of SOAP.

At the time of this writing, Microsoft doesn’t provide a toolkit that supports SOAP on a PDA, not even for Windows CE machines. As a result, we’ll examine several other SOAP toolkit choices in the “Getting SOAP for Your PDA” section of this chapter. We’ll look at choices for several popular PDA operating systems.

The third (“Updating the Complex Type Example”) and fourth (“Updating the Computer Name Example”) sections of this chapter will show how to create SOAP applications for a PDA. Both of these examples rely on components that we’ve already worked with in the book. We’ll create examples for the PDA that require no server-side coding changes at all. That’s one of the benefits of working with SOAP: cross-platform compatibility. Consider these examples as the proof-of-concept sections of the book. Certainly, developing examples that will work on both a PDA and a desktop with equal functionality is a challenge that other technologies would be hard-pressed to match.

The fifth section of the chapter, “Addressing PDA Display Issues,” will discuss some of the problems that you’ll encounter with PDA displays. The most obvious problem is the lack of screen real estate to display anything. Developers who work in the world of PDAs will quickly find they need to provide ingenious solutions to the display problem.

You won’t work with PDAs forever. Cell phones are certain to follow as a device that developers will learn to hate as they create new distributed applications for even smaller devices. I won’t provide you with an actual example in the “Beyond PDAs to Telephones” section of the chapter, mainly because SOAP toolkits aren’t available for this platform yet. We’ll discuss many of the problems you’ll face as you move to this new platform.

The seventh section of the chapter, “Understanding PDA Security Issues,” discusses security problems that are specific to PDA development. These security problems are in addition to those you’ll normally find with any SOAP application. It’s important to understand that vendors are working on these problems even as I write this chapter, but solutions will likely take time to arrive.
Finally, we’ll look at some troubleshooting techniques for PDA applications based on the content of the entire chapter. You’ll learn some of the ins and outs of working with a PDA in a development environment. This section will tie everything together in a neat package for you as you begin writing PDA applications of your own.
PDA development is moving so fast that it’s hard for anyone to keep up. Figuring out where PDAs will go in the future is even harder. Fortunately, someone is already performing the research required to show you current trends in PDA development. Learn more about the future of PDAs at http://researchportal.com/. The Research Port site does contain information about many different aspects of computers today, but you’ll find that they cover mobile computing topics especially well.
Special Needs for PDAs

There’s no free lunch. I often wish I had been the first one to say that because it’s so true. Companies that want to gain the advantages of using PDAs also have to decide how to handle the special needs of these devices. A PDA isn’t a single-purpose device like a radio, but it isn’t a full-fledged computer either; it’s somewhere in-between.

After spending some time working with several PDAs for this chapter, it’s apparent that SOAP is even less ready for prime time when it comes to these devices than it is for the desktop. (The test machines included a Casio Cassiopeia and a Palm VII.) This makes sense because software of this sort normally appears on the desktop first. However, it also means that you need to consider your PDA development plans carefully because of the many pitfalls. In all cases, you’ll want to build a desktop version of your application before you attempt to create one for your favorite PDA.

Fortunately, you can create a SOAP application for your favorite PDA; it just takes a little more planning. The following sections examine the special needs of PDAs. It’s important to note that most of these special needs are in addition to what you must consider for a desktop application.
The Case for PDAs

Developing SOAP applications for PDAs will require considerable work. Many developers are unused to working with devices that have small screens; the memory limitations are problematic at best, and there’s a limit to the number of available development tools. However, the need for PDA development is strong.

Consider the case of Sears. They recently purchased 15,000 Palm PDAs for their business. The deal is worth between $20 million and $25 million. That’s a lot of PDA power for their staff. Each PDA is equipped with a built-in bar-code scanner and a wireless modem. The company plans to use these new devices for inventory management, price changes, and merchandise pickups. Some developer has a large programming task in the works as I write this. You can bet that such a serious investment comes with an equally serious need to affect the bottom line. In short, PDAs are becoming mainline systems for many different tasks.
Companies often cite two main reasons for switching to PDAs after using other devices. The first reason is that PDAs cost less to buy, operate, and maintain than many other devices. A PDA equipped with the right add-on devices can perform a myriad of tasks in an intelligent manner. The second main reason is ease of use. PDAs have a limited number of buttons on them and the functions of each button are easy to understand. The user writes on the screen as he would using pen and paper; the PDA uses handwriting recognition to convert the handwritten information into text.

Sears might be looking toward the future as well. A customer could come into the store with a PDA, beam the information to a sales clerk’s PDA, and get his merchandise faster than ever before. Of course, this use is in the future; most shoppers today don’t place their orders in a PDA. The point is that Sears and other companies like K-Mart are already planning for this eventuality.

Unlike older, single-function devices, PDAs are completely programmable. This means an investment in hardware today won’t become an albatross tomorrow. Companies can extend the life of an investment by using the same PDA in more than one way. As the PDAs age, they’ll handle applications with lower programming requirements.
Special Add-ons

Most vendors design PDAs as electronic versions of the calendar, address book, and personal note taker. Early versions of these products didn’t include the mini-word processors and spreadsheets you’ll find in modern versions. In fact, today you can extend many PDAs to double as cameras, scanners, and other devices with special add-ons.

The PDA isn’t exactly a standard device. There are many hardware implementations, more than a few operating systems, and even different capabilities to consider. When users start adding features to their PDA, you might find that it’s nearly impossible to determine which features you can rely on finding. In short, standardization within the company is essential, even if there’s chaos outside.

These special add-ons can also work to your advantage. Imagine creating an application to work with one of the camera attachments for a PDA. Each picture is automatically transferred to a remote processing center as the photographer takes pictures. SOAP could make this task relatively easy and automatic. The pictures would be ready for viewing by the time the photographer reaches the home office.

In summary, special PDA add-ons present problems because they create a non-standard programming environment. On the other hand, these add-ons can create new productivity situations where a developer can provide functionality that no one has ever seen before. The optimum setup is to standardize features required to make SOAP work, such as the type of network interface card (NIC). On the other hand, it’s important to consider specialization. Adding a camera to a PDA turns it into a direct image transfer device. PDAs provide the means to extend what computers can do as long as you configure them with application connectivity in mind.
Networking SOAP relies on a connection between the client and the server. It’s easy to create a connection when you’re working with a desktop machine. If you can’t create a direct connection using a LAN, there are always alternatives such as using dial-up support or a wireless network connection. SOAP makes communications between a server and desktop machine easy because of the many ways to create the connection. PDAs might not offer much in the way of network connections. In many cases, the PDA vendor will offer a method to synchronize the PDA with the desktop. The synchronization process works with static data and won’t provide you with a live connection to the network. It helps to have a wireless network setup when working with a PDA because many vendors design PDAs to use this connection type. I was able to find a third-party product for connecting a Pocket PC directly to the network using a special NIC. Every PDA that I looked at does provide some type of modem support, but the modem is usually an add-on and doesn’t come with the device. Again, you’ll need to standardize the kind of modem support you need for your application. You’ll also want to be sure that you can create the desired connection and view material using a browser if you decide to go that route. The bottom line is that the original purpose of a PDA conflicts with the ways that some people use them today. Vendors designed the PDA to provide electronic versions of calendars and address books. Yes, you can run SOAP on a PDA, but only if you have the required live network connection. Obtaining that connection can prove difficult to say the least. Vendors seem to know that this is an issue and at least some of them are working on ways to overcome these problems.
Operating System You probably know about the major PDA operating systems on the market today. Palm is one of the most common PDAs, and it has its own special operating system. Windows CE (also known as the Pocket PC) is another favorite because it looks and acts much like Windows for the desktop. What you might not realize is that the operating system the PDA package says it uses might not be exactly the operating system you’ll get. Small variations between machines make the difference between a SOAP toolkit that works and one that won’t even install. The operating system also determines what type of client you can run. Palm systems offer ease of use and low cost. However, they also offer fewer features and an operating system that is less capable than Windows CE. If you plan to target the Palm, then you should probably target Java as your programming language and plan to reduce the number of advanced application features. Windows CE allows multiple levels of development. The operating system provides a subset of the features found in the Windows API. This means that you can use the programming techniques you learned in the past if you’re already a Windows developer. There’s even a special toolkit for working with Windows CE devices (http://www.microsoft.com/catalog/display.asp?subid=22&site=763&pg=1). This toolkit includes a Windows CE emulator that allows you to test your application without a Windows CE device, along with add-ons for Visual Basic that make development easier.
Because Windows CE also contains Internet Explorer, you can interact with it using a browser application. Be warned, though, that the version of Internet Explorer that ships with Windows CE doesn’t include full scripting support. You can use JScript (Microsoft’s form of JavaScript), but not VBScript (see “Internet Programming with Windows CE” http://www.microsoft.com/mind/0599/webce/webce.htm for details). Windows CE will simply ignore script commands it doesn’t understand in the Hypertext Markup Language (HTML) file. In addition to JScript, you also have access to some Internet Explorer functions such as Alert(), and you can use standard HTML tags. The Windows CE version of Internet Explorer will also work with Java applets. This means that you can create complex browser applications that don’t rely on VBScript. As PDAs increase in complexity, the operating systems that support them will as well. This means that SOAP toolkits of the future should provide robust capabilities for PDAs and even cellular telephones. Consider the capabilities of today a mere shadow of what lies ahead.
Getting SOAP for Your PDA If you’re the lucky owner of a Windows CE machine (also known as the Pocket PC), getting a SOAP toolkit is relatively painless. Palm owners are next on the list. At least one vendor is developing a solution for this platform. None of the other PDA choices on the market has a SOAP toolkit available as of this writing. However, given the newness of this technology, you can expect other vendors to provide SOAP toolkit offerings for other platforms eventually. The SOAP::Lite site (http://www.soaplite.com/) contains a section of SOAP toolkit links you can check periodically for new additions. This list tells which toolkits will work with PDAs.
According to a recent Research Portal study, developers are most likely to favor handheld devices that use the Microsoft (32%) or Palm (27%) operating system. However, if you consider that these top two operating system choices only control about 59% of the market, it’s plain that developing for any given platform could be risky. There are two ways to get around this problem. First, ensure your company standardizes on a single handheld device operating system if possible. Second, make sure you write most of the processing code for a PDA application to run on the server, rather than the client. This allows you to write for multiple PDA operating systems with less effort.
Despite the long list of SOAP toolkits you find on the SOAP::Lite site, most aren’t ready for prime time. The vast majority are in beta or not in a released state at all. The choices of usable toolkits for SOAP are extremely limited now, but you should see more choices as SOAP becomes entrenched within the corporate environment. The bottom line is that you not only need to find a SOAP toolkit for your PDA, but you need to find one that’s fully functional.
It’s impossible to know at this point just how many SOAP-related specifications will eventually appear on the horizon. One of the best places to learn about new specifications is the XML Web Service Specifications page at GotDotNet (http://www.gotdotnet.com/team/xml_wsspecs/default.aspx). Three new specifications recently appeared on the standards groups’ agenda: SOAP Routing Protocol (SOAP-RP), Direct Internet Message Encapsulation (DIME), and XLANG. Each of these three specifications is so new that not much is written about them yet. However, Microsoft uses XLANG (http://www.gotdotnet.com/team/xml_wsspecs/xlang-c/default.htm) with their BizTalk Server product (see Appendix B, “Microsoft Biztalk and SOAP”). XLANG will allow developers to model business processes using a standardized syntax. SOAP-RP (http://www.gotdotnet.com/team/xml_wsspecs/soap-rp/default.html) makes it easier to move data using SOAP over transports such as TCP, UDP, and HTTP in one-way, request/response, and peer-to-peer scenarios. DIME (http://www.gotdotnet.com/team/xml_wsspecs/dime/default.htm) is used to package data using a binary format in a form called payloads. Vendors are now trying to make all of these variations on a theme fit within a framework (http://www.w3.org/2001/03/WSWS-popa/paper51). The idea of a framework is to show how the pieces fit together into a cohesive whole. You can monitor progress on the framework as well as other XML projects at http://www.w3.org/2001/04/wsws-proceedings/ibm-ms-framework/. You’ll also want to check the discussion groups for SOAP-RP (http://discuss.develop.com/soap-rp.html), DIME (http://discuss.develop.com/dime.html), and XLANG (http://discuss.develop.com/xlang.html).
The following paragraphs provide a quick overview of some of the better SOAP toolkit choices for PDAs today. This list isn’t exhaustive or even partially complete. I chose to concentrate on those toolkits that are close enough to completion that you can use them for development today.
pocketSOAP This is the best choice if you own a Pocket PC. The same developer, Simon Fell, produces Simon’s Soap Server Services for COM (4S4C) (see Appendix C, “Third-Party Tool Reference,” for details) and pocketSOAP (http://www.pocketsoap.com/). We used the desktop version of pocketSOAP in Chapter 8, “Providing Remote Database Access.” The Pocket PC version is just as easy to install and works just as well within the confines of the Pocket PC’s feature set. In some cases, you can move code directly from your desktop to the Windows CE machine with few changes. This is especially true if you use scripts within a Web page as we will for the examples in this chapter. Although pocketSOAP is still in beta, you’ll find that most applications work with few problems. The only major problem that I experienced during testing was an occasional HTTP timeout error. The developer has promised to keep working on the kinks, so you’ll likely find this product an optimum choice for PDA development. The only caveat when using pocketSOAP is that you need to create the message by hand. It doesn’t support a high-level API like the Microsoft SOAP Toolkit. However, this actually turned out to be beneficial when working with more than one platform, as we’ll see in the examples. The bottom line is that you need to be prepared to spend a little more time coding when working with pocketSOAP, but the outcome is well worth the effort.
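Because pocketSOAP leaves message construction to you, it helps to see roughly what a hand-built request looks like on the wire. The following sketch is plain JavaScript and makes no pocketSOAP calls. The helper names escapeXml and buildEnvelope are hypothetical, the method name and namespace URI are borrowed from the Complex Type example later in this chapter, and the code assumes a SOAP 1.1 envelope in which every parameter is a simple string.

```javascript
// Hypothetical helper: escape the characters XML treats as markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build a SOAP 1.1 request envelope by hand.
// params is an array of { name, value } pairs, written in order.
function buildEnvelope(methodName, namespaceUri, params) {
  var body = "";
  for (var i = 0; i < params.length; i++) {
    body += "<" + params[i].name + ">" +
            escapeXml(params[i].value) +
            "</" + params[i].name + ">";
  }
  return '<?xml version="1.0"?>' +
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">' +
    "<SOAP-ENV:Body>" +
    "<m:" + methodName + ' xmlns:m="' + escapeXml(namespaceUri) + '">' +
    body +
    "</m:" + methodName + ">" +
    "</SOAP-ENV:Body>" +
    "</SOAP-ENV:Envelope>";
}

// Example request using names from the Complex Type example.
var message = buildEnvelope(
  "GetComplexString",
  "http://winserver/soapexamples/ComplexType/",
  [{ name: "StringData", value: "Hello & goodbye" }]
);
```

A real toolkit adds type information and further namespaces, so treat this only as a picture of the basic envelope, body, and method layers.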
IdooXoap IdooXoap (http://www.idoox.com/idooxoap.html) is a fully developed desktop product. It currently comes in Java and C++ flavors that many developers will find useful for both Web-based and desktop applications. The vendor is currently working on a version of IdooXoap for the Pocket PC. Unfortunately, the PDA product wasn’t ready at the time of writing. One of the advantages of IdooXoap is that it provides cross-platform support. The server part of the product runs on Linux or Unix. Eventually it will run on Windows without modification as well. (There’s a downloadable ISAPI package for IdooXoap on the Web site, but you need to put the package together before you can use it.) The client runs on Linux, Unix, and all flavors of Windows (except Windows CE).
IdooXoap provides the best support for multi-part MIME messages. The support is only available on Linux and Unix machines now, but might become available on other platforms. Compatibility issues flaw the MIME support, but the vendor might have them fixed by the time you read this. You shouldn’t count on MIME support for PDAs anytime soon because of resource limitations. However, given the progress this particular vendor has made, the possibility of MIME support on a PDA does exist.
A major goal of IdooXoap is to provide the same programming experience no matter which platform you’re using. Like pocketSOAP, this means that you should be able to move at least some of your code between platforms. However, the Java version is more likely to provide seamless support. IdooXoap also provides special support for the Tomcat server. You can download versions with or without Enterprise Java Beans (EJB). It was interesting to note that several of the specialty downloads include notes about daily product builds, an indicator of the volatility of the SOAP toolkit market.
kSOAP Most of the SOAP development efforts for the PDA currently center on the Pocket PC. Many developers say that the Pocket PC provides more resources for them to use in creating SOAP applications and that the Palm is somewhat limited. Given the installed base for the Palm, however, it’s only a matter of time before developers will create SOAP clients for it as well. One of the best examples of a SOAP client for the Palm is kSOAP (http://ksoap.enhydra.org/software/downloads/index.html). This Java implementation requires you to download kXML (http://kxml.enhydra.org/software/downloads/index.html). Both products are in beta now and you might find it difficult to create robust applications using them. However, both will work with the Palm (at least the Palm VIIx I used for this book). Another potential SOAP toolkit for the Palm is Bubbles (http://www.soaprpc.com/software/bubbles/). Unfortunately, the site doesn’t contain downloadable code as of this writing, but you might find some code by the time you read this. Like kSOAP, Bubbles relies on another product, KVM (http://java.sun.com/products/cldc/), to provide low-level services. You’ll need both products to use this solution.
Updating the Complex Type Example The Complex Type Example first appeared in Chapter 8. Remember that this is the example that relies on 4S4C for the server-side component and pocketSOAP for the desktop. The main reason for using third-party products is that the Microsoft SOAP Toolkit doesn’t provide the required support. In this case, we’re defining a complex string type that resembles a Pascal string in that it has both a string part and a length part. The following sections provide you with a detailed look at the client code and a summary of some differences you should note between desktop and PDA code. We’ll also discuss some important issues during application testing, such as sources of connectivity problems when working with a PDA. This section contains some of the surprises developers experience when they try SOAP on a PDA for the first time.
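To make the Pascal string comparison concrete, here is a small sketch of the data shape involved. The member names StringData and StringLength come from Listing 10.1; the helpers makeComplexString and isConsistent are hypothetical and aren't part of 4S4C or pocketSOAP.

```javascript
// Hypothetical helper: pair a string with its length, the way a
// Pascal string carries its length alongside its characters.
function makeComplexString(text) {
  return {
    StringLength: text.length, // length part
    StringData: text           // string part
  };
}

// Hypothetical check: do the two parts agree with each other?
function isConsistent(complex) {
  return complex.StringData.length === complex.StringLength;
}

var sample = makeComplexString("Hello");
```

A check like isConsistent is one way a client could catch a mismatched length before sending the request, since the example form lets the user type the two parts separately.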
Creating the Client Code You won’t need to change the server-side component for this example, so be sure to look at the source code for it in Listing 8.2 of Chapter 8. Listing 10.1 shows the client code for this example. Note that we’re using a Web page instead of the desktop application used in Chapter 8. It’s interesting to note that you can move this application directly to a desktop machine and run it within Internet Explorer without change.
Listing 10.1 Complex Type Client Code for PDA

Simple PDA SOAP Example

function Test(String, Length)
{
   var SOAPEnv;    // SOAP envelope
   var Transport;  // SOAP transport
   var Param;      // Parameter list
   var SOAPParam;  // SOAP method call parameters.
   var RecData;    // Received data holder

   // Create the envelope.
   SOAPEnv = new ActiveXObject("pocketSOAP.Envelope");
   SOAPEnv.MethodName = "GetComplexString";
   SOAPEnv.URI = "http://winserver/soapexamples/ComplexType/";

   // Create a parameter to place within the envelope.
   Param = SOAPEnv.CreateParameter("InString", "", "SOAPStruct");

   // Initialize the parameter.
   SOAPParam = new ActiveXObject("pocketSOAP.Param");
   SOAPParam.Init("StringLength", Length, "");
   Param.Parameters.Append(SOAPParam);
   SOAPParam = new ActiveXObject("pocketSOAP.Param");
   SOAPParam.Init("StringData", String, "");
   Param.Parameters.Append(SOAPParam);

   // Send the request and receive the data.
   Transport = new ActiveXObject("pocketSOAP.HTTPTransport");
   Transport.Send("http://winserver/soapexamples/ComplexType/soap.asp",
                  SOAPEnv.Serialize());
   RecData = Transport.Receive();
   SOAPEnv.Parse(RecData);

   // Display the result.
   RecData = SOAPEnv.Parameters.Item(0);
   window.document.SampleForm1.Results.value = RecData.Value;
}

Get Complex String
Input a string:
Input a length:
Result values:
JScript is a case-sensitive language. This means that you have to watch for capitalization errors such as using Value in place of value. Unfortunately, the IDE for Visual Studio does a poor job of pointing out such errors, so locating them can be time consuming. Fortunately, Internet Explorer and other browsers will at least provide a line number as part of an error message so you know which line of code to look at.
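A short sketch shows why a capitalization slip can be hard to spot. Reading a misspelled property doesn't fail; it quietly yields undefined, and an error appears only when the script tries to call something that isn't there. The field object here is a plain stand-in for a form field rather than a real browser element, and GetValue is a deliberately nonexistent method name.

```javascript
// Stand-in for a form field; a real page would use the browser's element.
var field = { value: "Hello" };

var right = field.value; // matches the property name exactly
var wrong = field.Value; // different capitalization, so a different name

// Reading a property that doesn't exist yields undefined rather than
// raising an error, so the mistake may surface only later.
var wrongIsUndefined = (typeof wrong === "undefined");

// Calling a name that doesn't exist does raise an error.
var threw = false;
try {
  field.GetValue(); // no such method on this object
} catch (e) {
  threw = true;
}
```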
The source code for this example follows the same five-step process as the desktop client. First, you create an envelope. Second, you place data within the envelope. Third, you initialize the data. Fourth, you send and receive the data. Finally, you display the data onscreen. Notice that the HTML code for this example looks the same as HTML that you use in any browser. Many developers now rely on the eXtensible HyperText Markup Language (XHTML) (http://www.w3.org/TR/xhtml1/) to work with PDAs. Using XHTML has the advantage of keeping the screens readable and small enough for a PDA. The disadvantage is that XHTML often makes desktop displays unusable, or at least unfriendly. Make sure you choose between HTML and XHTML at the outset of your development project based on how you intend to use the Web pages later.
Differences in Implementation Compare the code in Listing 10.1 with the client code in Listing 8.3 and you’ll see many similarities. You need to recognize the limitations of a PDA, however. In this case, the code will fail if you try to nest calls in the same way that they appear in Chapter 8. For example, in Chapter 8 you’ll see this nested call:
SOAPEnv.parse Transport.Receive
We need to replace that call with an alternative in this chapter. Note that the call now relies on an intermediate variable and requires two lines of code, as shown here:
RecData = Transport.Receive();
SOAPEnv.Parse(RecData);
Of course, the use of JScript makes a difference as well. The JScript code looks similar to C, while we used Visual Basic for the desktop example. These differences can cause problems if you don’t know what to look for. In most cases, a PDA will require extra steps to accomplish the same task that a desktop machine can. This makes sense considering the resources on a PDA are limited.
Testing the Application Figure 10.1 shows the application in action. If you account for differences between a desktop and browser application, the two applications look about the same. They work the same as well. Of course, the Web client design works with the smaller screen of a PDA. We’ll discuss display size issues in the “Addressing PDA Display Issues” section of the chapter. Now that you have two versions of the same application, you can perform tests to see how application performance differs on a PDA when compared to a desktop. You’ll find that the PDA executes more slowly, for obvious reasons. Not only is the processor slower, but the PDA is using a Web-based application. However, you might be surprised to see how small a performance degradation the application experiences. Depending on the complexity of your application, you might find that users will gain access to data on the road without losing much performance.
Figure 10.1 The PDA version of the application looks and acts the same as the desktop version.
One issue that will crop up is a problem with HTTP timeouts. An application that runs fine on a desktop might suddenly experience errors on a PDA. A combination of factors contributes to this problem, including
■ PDA processing speed
■ Connection speed
■ Web-based application performance issues
■ Server biases for processing local data first
■ PDA connectivity issues
In most cases, you can combat this problem by setting the timeout value of the server higher. However, this won’t fix the problem all of the time. You might find that the PDA has a terrible connection to the Internet or LAN. The user might not understand how PDA connectivity works. In some cases, the server’s load is too high and it won’t respond quickly no matter what you do. Unlike the server that issues Web pages on a daily basis, a server that services SOAP applications will require better than average response time and processing power.
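Raising the server timeout isn't always possible, so a client can also retry a failed request a limited number of times. The sketch below is an assumption about how you might structure that, not a pocketSOAP feature. The helper sendWithRetry is hypothetical, and its sendOnce argument stands for whatever function performs the Send and Receive calls and throws when the request fails. The flakySend function only simulates a transport so the example can run on its own.

```javascript
// Hypothetical helper: call sendOnce() until it succeeds or the
// attempt limit is reached. sendOnce must throw on failure and
// return the response text on success.
function sendWithRetry(sendOnce, maxAttempts) {
  var lastError = null;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { response: sendOnce(), attempts: attempt };
    } catch (e) {
      lastError = e; // remember the failure and try again
    }
  }
  throw lastError || new Error("maxAttempts must be at least 1");
}

// Simulated transport that times out twice before answering.
var calls = 0;
function flakySend() {
  calls++;
  if (calls < 3) {
    throw new Error("HTTP timeout");
  }
  return "<response/>";
}

var result = sendWithRetry(flakySend, 5);
```

Keep the attempt limit small. Each retry adds to the load on a server that might already be struggling, which is one of the causes listed above.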
Updating the Computer Name Example The Computer Name example first appeared in Chapter 9, “Moving to Web-Based Applications.” This is the two component example that relies on a server-side component to perform the work and a parsing component to provide a SOAP-friendly interface. You’ll also remember that this example plays on one of the advantages of the Microsoft SOAP Toolkit—the ability to handle complex redirections. The 4S4C product fails when you attempt to use it with this component combination. As with the previous example, the code for the two components remains unchanged in this example. You’ll find the code for the server-side component in Listing 9.1. The processing component code appears in Listing 9.2. Make sure you test the application thoroughly using the thick (Listing 9.3) and thin clients (Listing 9.4) before you move on to the code in this chapter.
The following sections provide a detailed look at the client code, discuss some message traffic issues, and tell you about some important differences between a 4S4C and a Microsoft SOAP Toolkit server implementation. We’ll also discuss application testing. In this case, the example uses multiple components, which can lead to interesting debugging problems when working with a PDA.
Creating the Code The client code for this section is more complex than the previous example. First, it requires two service buttons instead of one. Second, the Microsoft SOAP Toolkit also requires slightly different input than 4S4C, so there are differences in the message formatting code as well. Listing 10.2 shows the client source code for this example.
Listing 10.2 Computer Name Client Code for PDA

CompName JScript Example

function cmdGetSingleName_Click()
{
   var SOAPEnv;    // SOAP envelope
   var Transport;  // SOAP transport
   var Param;      // Parameter list
   var SOAPParam;  // SOAP method call parameters.
   var RecData;    // Received data holder

   // Create the envelope.
   SOAPEnv = new ActiveXObject("pocketSOAP.Envelope");
   SOAPEnv.MethodName = "GetCompName";
   SOAPEnv.URI = "http://tempuri.org/message/";

   // Create a parameter to place within the envelope.
   Param = SOAPEnv.CreateParameter("NameType",
      window.document.SampleForm1.comboName.value, "");

   // Send the request and receive the data.
   Transport = new ActiveXObject("pocketSOAP.HTTPTransport");
   Transport.SOAPAction = "http://tempuri.org/action/NameValuesProc.GetCompName";
   Transport.Send("http://WinServer/soapexamples/ComputerName/CompNameProc.WSDL",
                  SOAPEnv.Serialize());
   RecData = Transport.Receive();
   SOAPEnv.Parse(RecData);

   // Display the result.
   RecData = SOAPEnv.Parameters.Item(0);
   window.document.SampleForm1.Results.value = RecData.Value;
}

function cmdGetAllNames_Click()
{
   var SOAPEnv;    // SOAP envelope
   var Transport;  // SOAP transport
   var Param;      // Parameter list
   var SOAPParam;  // SOAP method call parameters.
   var RecData;    // Received data holder

   // Create the envelope.
   SOAPEnv = new ActiveXObject("pocketSOAP.Envelope");
   SOAPEnv.MethodName = "GetAllNames";
   SOAPEnv.URI = "http://tempuri.org/message/";

   // Create a parameter to place within the envelope.
   //Param = SOAPEnv.CreateParameter("NameType",
   //   window.document.SampleForm1.comboName.value, "");

   // Send the request and receive the data.
   Transport = new ActiveXObject("pocketSOAP.HTTPTransport");
   Transport.SOAPAction = "http://tempuri.org/action/NameValuesProc.GetAllNames";
   Transport.Send("http://WinServer/soapexamples/ComputerName/CompNameProc.WSDL",
                  SOAPEnv.Serialize());
   RecData = Transport.Receive();
   SOAPEnv.Parse(RecData);

   // Display the result.
   RecData = SOAPEnv.Parameters.Item(0);
   window.document.SampleForm1.Results.value = RecData.Value;
}

Computer Name Component Test
Select a single computer name if needed:
   ComputerNameNetBIOS
   ComputerNameDnsHostname
   ComputerNameDnsDomain
   ComputerNameDnsFullyQualified
   ComputerNamePhysicalNetBIOS
   ComputerNamePhysicalDnsHostname
   ComputerNamePhysicalDnsDomain
   ComputerNamePhysicalDnsFullyQualified
   ComputerNameMax
Result values:
As with the previous example, this example uses a five-step process for each function. The HTML code is harder to optimize for a small screen, in this case, because of the amount of data we need to display. We’ll see later how you can overcome these problems without modifying the server-side component.
A Look at the Message Traffic The Trace Utility (MSSoapT) allows you to monitor message traffic between the client and server. Generally, you’ll use this tool to debug your application. However, you can also use it to analyze the way clients and servers interact. Learning how to see major differences between two implementations quickly can save you major debugging time. Figure 10.2 shows the message traffic for the Thin Client example in Chapter 9. Remember that this example relies on the Microsoft SOAP Toolkit and VBScript. Notice the message includes only a few namespace entries. In addition, the message doesn’t define data types. We’ve discussed this issue before with the Microsoft SOAP Toolkit. Figure 10.2 An example of the message traffic for the Microsoft SOAP Toolkit.
Now let’s look at the pocketSOAP version of the message traffic in Figure 10.3. Notice that this message contains far more namespaces and that the message defines the data types. The pocketSOAP message provides more information and works with more servers. One area where pocketSOAP might present problems is the XML tag at the beginning of the message. Notice that this tag is missing and the server won’t know which version of XML the client needs. Figure 10.3 An example of the message traffic for pocketSOAP.
Note that although the client message has changed, the server response hasn’t. That’s because the server-side component is unchanged. You’d need to change the server-side component code to allow interoperability with other platforms that require it, such as Apache. Figures 10.2 and 10.3 demonstrate something else. The basic format of the message is the same. They have differences, but the two clients present the information in the same way. The client needs to speak the same language as the server. This means changing some message elements within your code to match the server’s implementation of SOAP. I hope that these changes will become fewer as vendors work on the SOAP specification. The importance of looking at message flow now is being able to create clients that will work with any server quickly. This means analyzing differences between the PDA client and a desktop client that already works. Because your choices of PDA SOAP toolkits are limited, you’ll want to learn to recognize problem areas in messages.
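When you compare two traces by eye, a quick way to spot the kinds of differences described here is to list each message's namespace declarations and check whether it begins with an XML declaration. The following sketch works on the message text alone. The helper names listNamespaces and hasXmlDeclaration are hypothetical, and the sample message is invented for illustration rather than copied from the figures.

```javascript
// Hypothetical helper: collect every xmlns declaration in a message.
function listNamespaces(xml) {
  var pattern = /xmlns(?::([\w.-]+))?\s*=\s*"([^"]*)"/g;
  var found = [];
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    // An empty prefix means a default namespace declaration.
    found.push({ prefix: match[1] || "", uri: match[2] });
  }
  return found;
}

// Hypothetical helper: does the message start with an XML declaration?
function hasXmlDeclaration(xml) {
  return /^\s*<\?xml\s/.test(xml);
}

// Invented sample, not taken from the figures.
var sample =
  "<SOAP-ENV:Envelope " +
  'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" ' +
  'xmlns:xsd="http://www.w3.org/2001/XMLSchema">' +
  "<SOAP-ENV:Body/></SOAP-ENV:Envelope>";

var namespaces = listNamespaces(sample);
var declared = hasXmlDeclaration(sample);
```

Running both helpers against a working desktop trace and a failing PDA trace gives you two short lists to compare instead of two screens of XML.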
Server Differences Revisited Listing 10.1 and Listing 10.2 have some differences that are obviously due to distinctions in application. However, you might not have noticed the nuances of difference between the two messages that are server related. For example, look at the Complex Type output message in Figure 10.4. Notice that some of the namespace URLs differ from those in Figures 10.2 and 10.3. Figure 10.4 Server differences become more apparent when you analyze message traffic.
When working with the Microsoft SOAP Toolkit, you’ll always set the SOAP envelope URI to http://tempuri.org/message/. The 4S4C server always requires the base URL for an application as part of the envelope. Although both servers rely on WSDL to define the service, they do so using different techniques. Another difference you should notice is that the Microsoft SOAP Toolkit requires a SOAPAction entry. You create this entry as part of the transport, not the envelope as you might expect. If you look at Figures 10.2 and 10.3, you’ll notice that the SOAPAction entry doesn’t even show up. We looked at another utility, tcpTrace, that does show this particular entry. Although the Trace Utility provides a neater, easier-to-read display, you’ll need tcpTrace to show differences of this sort. If you don’t include the SOAPAction entry, the server will complain and the message will remain unprocessed.
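One way to keep these server differences out of the main client code is to gather them into a single lookup. The sketch below is only an illustration. The function requestSettings and the two server labels are hypothetical, while the URI and SOAPAction values are the ones used in Listings 10.1 and 10.2.

```javascript
// Hypothetical helper: return the per-server values a client needs.
// serverKind is "mstk" (Microsoft SOAP Toolkit) or "4s4c"; both
// labels are invented for this sketch.
function requestSettings(serverKind, appBaseUrl, actionName) {
  if (serverKind === "mstk") {
    return {
      envelopeUri: "http://tempuri.org/message/", // fixed for this server
      soapAction: "http://tempuri.org/action/" + actionName
    };
  }
  if (serverKind === "4s4c") {
    return {
      envelopeUri: appBaseUrl, // base URL of the application
      soapAction: null         // Listing 10.1 sets no SOAPAction
    };
  }
  throw new Error("Unknown server kind: " + serverKind);
}

var ms = requestSettings("mstk", null, "NameValuesProc.GetCompName");
var fourS = requestSettings(
  "4s4c", "http://winserver/soapexamples/ComplexType/", null);
```

A client could then assign envelopeUri to the envelope and, when soapAction isn't null, assign it to the transport before sending.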
You might want to check the source code for other differences. For example, the transport URL is precisely the same when using pocketSOAP as it is when using the Microsoft client. However, the transport URL for pocketSOAP when using 4S4C is http://localhost/soapexamples/ComplexType/soap.asp in this example. Notice that we didn’t include the “?WSDL” part that normally appears in the Microsoft SOAP client code. Again, this is another example of server differences; it also shows how clients interact with their server.
Testing the Application The Computer Name example doesn’t work quite the same on a PDA as it does on the desktop. The reason is simple—a PDA lacks the screen real estate to display this application fully. Part of the testing process is to ensure the screen remains close to the original (to keep training costs down) while ensuring readability. Figure 10.5 shows the output from this example if you click Get Single Name. Figure 10.5 Screen real estate makes a difference in application appearance in some PDA applications.
The application does work the same from a functionality perspective. You still use a dropdown list box to select a computer name type. The buttons still produce the same results as before. Maintaining this level of compatibility between a desktop and PDA application is essential to the success of the application. Any application that requires the user to relearn a business process is doomed to failure, no matter how nice the display looks. Fortunately, this example is reasonably small. Your testing process must include resource usage and other factors that affect PDA application performance. Complex database applications won’t run very well on a PDA because they require too many resources. Of course, you can always get around the problem by using more server-side processing and presenting the PDA with semi-finished, rather than raw results.
Addressing PDA Display Issues Most developers realize that working with a PDA is going to be a challenge before they begin their first project. The problem is that the challenge usually turns out larger than expected. Small displays with limited color capability are something that most developers consigned to the past. The old techniques that developers used to conserve screen space are suddenly appearing again.
The following sections discuss several important display issues when working with a PDA. This section won’t provide an in-depth treatise on the subject, but you’ll walk away with some fresh ideas for your next SOAP application.
Screen Size Many users have 17” or 19” monitors capable of a minimum of 1280 × 1024 resolution today. Developers have taken advantage of the screen real estate to create better applications that display more data at one time. Even Microsoft uses higher resolutions as a baseline for applications; many of their application screens won’t fit on an 800 × 600 display anymore. Everything you want to do with your PDA has to fit within 320 × 200 pixels. That’s a lot smaller than the typical computer screen. In addition, some PDAs use black-and-white displays in place of color, so you can’t even use some of the modern tricks to make the display look nicer. In short, PDA screens tend to look a bit plain, and developers normally find themselves feverishly cutting their application screens down to size. No matter what you do, it’s impossible to fit 1280 × 1024 worth of application screen in a 320 × 200 space in many cases. When this happens, you’ll find that you need to make some compromises in the display. For example, even the computer name example in this chapter ran into problems displaying all of the information that it could provide. Figure 10.6 shows the results of some cutting that I performed to make the application data fit and still look reasonably nice. Figure 10.6 PDA displays are challenging to work with because they’re so small.
In this case, I indented the second line of each data entry to allow enough space for long entries. Notice that the data is still readable and the user won’t have to guess about the formatting. Of course, this is still a less than perfect solution because the data does appear on two lines. It’s important to keep every application element on a single screen if possible. Figure 10.6 does this by sacrificing application data display space. The user can scroll through the data in the Result values field without moving other screen elements around. The stylus provided with a PDA doesn’t lend itself to mouse-like movement.
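The indenting technique is simple to express in script. This sketch assumes each result is a name and value pair and that the results field preserves line breaks and leading spaces. The names formatEntry and formatEntries are hypothetical, and the sample values are invented.

```javascript
// Hypothetical helper: put the value on its own indented line so a
// long entry doesn't run off a narrow display.
function formatEntry(name, value) {
  return name + ":\n   " + value;
}

// Hypothetical helper: format a list of { name, value } pairs.
function formatEntries(entries) {
  var lines = [];
  for (var i = 0; i < entries.length; i++) {
    lines.push(formatEntry(entries[i].name, entries[i].value));
  }
  return lines.join("\n");
}

// Invented sample values for two of the computer name types.
var text = formatEntries([
  { name: "ComputerNameNetBIOS", value: "WINSERVER" },
  { name: "ComputerNameDnsHostname", value: "winserver" }
]);
```

Because the formatting happens in the client script, the server-side component can keep returning the same data to desktop and PDA clients alike.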
Make sure you consider XHTML for complex applications with many elements. XHTML allows you to display your application in segments with relative ease. Other options include using the Handheld Device Markup Language (HDML) (http://www.w3.org/TR/NOTE-Submission-HDML-spec.html) or Wireless Markup Language (WML) (http://www.oasis-open.org/cover/wap-wml.html). Both of these technologies use the concept of cards and decks to break up information into easily managed pieces. Of course, the PDA you use has to provide support for these standards before you can use the tags within a document. As with XHTML, using either HDML or WML will prevent your page from appearing properly on a desktop machine.
Using Color

Developers have gotten used to seeing colors on their applications. Color dresses up a drab display and makes the application more fun to use. In addition, using color presents cues to the user. For example, many users associate green with a good condition and red with something bad. In short, most applications today rely heavily on color with good reason.

Depending on the PDA you use, you might not have color at all. For example, many Palm models present the world in shades of gray. Even if a PDA does provide color support akin to the Pocket PC, the developer still has to use color carefully. The problem for PDA users is that the screen is already small. If they get into an area with bright sunlight, seeing the screen might become impossible, especially if it’s filled with colors that don’t work well in such an environment. (Vendors are making advances with regard to bright sunlight. For example, the Ipaq H3650 provides good display support even in bright sunlight.)

Notice that the PDA screenshots in this chapter are mainly black and white. The actual screens contain some color for the icons, but that’s about it. Because these applications don’t need color to present the information they can provide, it’s possible to rely on a black-and-white image.

Using color to display icons or to convey a message is still a good idea, even in the world of the PDA. For example, a red icon could signal danger or tell the user to wait without using up screen real estate for words. Of course, you need to explain the meaning of the color changes within a manual or help file (preferably both).
Pointer Pointers

Most PDA users rely on a pointer to do all of their work. Sure, a few PDAs do offer a keyboard and mouse as separate items, but most of these offerings are bulky and difficult to use. Pointer use is one of the reasons that you want to keep your application on one screen, or use multiple screens when necessary. Scrolling on a PDA screen is less than intuitive and requires some level of skill to master.

SOAP applications that you move from the desktop to the PDA will require some modification for screen size in many cases. While you’re working on the screen, it might be a good time to add some pointer-friendly features as well. For example, try to make as many tasks as possible accessible with a single pointer touch. Users should be able to point to what they want and allow the PDA to complete it for them.
You can also build intelligence into the application. A SOAP application normally has a direct connection to the server. You can use some of the server’s processing power to make things easier on the user. Most PDAs already include predictive logic as part of their setup. For example, as you write something, the PDA tries to guess the entire word. When it guesses the correct word, you can click on it and save some writing time. The same principle works for other activities as well. For example, a SOAP application could automatically display a data screen that the user needs most often, rather than force the user to dig through several screens to find it. Pointer-friendly programs also make tasks yes or no propositions. Again, this allows the user to accomplish the task with a single click, rather than write something down. The point is to make the PDA as efficient as possible so the user doesn’t get frustrated trying to do something easy.
Beyond PDAs to Telephones

You won’t find telephones that run SOAP applications today. In fact, I was surprised to find that some vendors are already creating embedded SOAP applications. (We talked about these applications in Chapter 9.) Embedded applications reside within toasters, your car, and even your television set. As SOAP moves from the desktop to the PDA, it’s only logical that it will move to the telephone as well.

SOAP is a good technology for telephones and other small devices because it provides a simple protocol. Telephones are all about communication. As more people start carrying cellular telephones, the need for connectivity applications will increase. This is an exciting time for SOAP developers because of the many different devices with which to work.

The interesting thing about telephones is that we already use them for so many business and personal purposes. For example, it’s possible to check your bank statement or send correspondence to your credit card company using the telephone. All you need to know how to do, in most cases, is push buttons. Of course, speaking into the telephone or entering some numbers for an account is hardly the same thing as using an application on a desktop computer. SOAP will enable two-way communications in a way that users haven’t seen before. For example, an employee could enter order information using a telephone (assuming the orders are small and relatively simple).

Will telephone SOAP applications appear overnight? Probably not. Will we use the telephone to accomplish the same things we do on the PC? Again, probably not. However, the telephone does present some interesting opportunities for SOAP developers in the future.
Understanding PDA Security Issues

Many network administrators view the PDA with more than a little suspicion, and for good reason. The media has painted the PDA as a device that is so open that anyone can access it at any time. Smart users keep all of their sensitive data on the company computer and just place lists of tasks on their PDA. Of course, such a view defeats the entire purpose of having a PDA in the first place. A PDA should be an extension of your workplace, not a hindrance to avoid.

Many of the security issues surrounding PDAs today stem from the perception that they’re all wireless devices. Many PDAs use network or modem connections, not wireless connections. The PDAs that do provide wireless access tend to err on the side of safety whenever possible. Vendors realize that wireless access is both a blessing and a curse for many companies. However, the wireless issue probably isn’t much of an issue for the SOAP developer. Count on many of your users to rely on modem or direct network interface card (NIC) connections. For example, the Pocket PC provides a slot that will accommodate an Ethernet card that provides a direct network connection.

One of the biggest security issues for all PDA users is data storage. There are actually two threats to consider. The first is that someone could access sensitive data if he stole your PDA (or at least borrowed it for a while). Wireless or not, the person who has the PDA also has access to the data it contains. The second threat is data loss. Many PDAs lose all of their information after the main battery is exhausted. (Some PDAs provide a backup battery that retains the contents of memory until the user replaces or recharges the main battery.) Unless the user backs up on a regular basis while on the road, the data is lost.

SOAP can come to the rescue, in this case, by allowing the user to store all data on a central server in a secure location. The application remains on the client machine, but the data remains secure at the main office. Of course, some level of flexibility has to be built into the security plan so that users can operate their PDA in a disconnected mode. PDA vendors will likely add some form of biometric protection to their PDAs in the future.
For example, the user might have to press his thumb in a certain area in the back when starting the PDA so the PDA can check his identity. Some security issues involve limitations in the PDA itself. For example, when working with a desktop application, you can ask the user to enter his name and password. The application asks the server for verification before it requests data access. The pointer a PDA user must use to enter data hampers the security effort. It’s possible that he’ll end up entering the password more than the three times that good applications normally allow. The next thing you’ll hear is a frustrated user on the telephone asking why he can’t access his application. Security is required, but any attempt to implement security that causes user frustration is almost certain to fail. Another PDA-specific security issue to consider is one of resources. Heavy security consumes many resources. A desktop user notices some slowing of his computer when he’s working with a secure application. The slowing is due to the work of the security protocol in the background. With the limited resources of a PDA, you probably can’t implement a level of heavy security because the application would run too slowly. The PDA already runs SOAP applications slower because it uses a browser-based application over a relatively slow connection. Adding to this problem is one way to lose user confidence in your application.
PDAs do suffer from every other security problem that SOAP applications normally experience. A lack of built-in security will allow prying eyes to see the data that the user transfers to the home office (at least if someone is actually looking). Unfortunately, the list of available security add-ons for PDAs is extremely limited. Theoretically, you can use technologies such as SSL with a PDA. We’ve already discussed the limitations of some of these technologies in previous chapters, so I won’t go into them again here. Needless to say, using a PDA involves some security risk that you’ll need to overcome.
Troubleshooting

This chapter has shown you how to work with PDA applications under SOAP. PDAs present some interesting challenges no matter which programming language you use to develop a client and no matter how you transfer data. Fortunately, unlike many technologies on the market, SOAP allows you to preserve the server side of your development investment. If you use a PDA such as the Pocket PC, you can also reuse some of your client code. In short, SOAP solves significant PDA development problems.

The following sections will examine some of the questions developers ask about using SOAP for PDA development and hopefully provide the answers you need to add SOAP to your PDA application toolkit. Always feel free to contact me at [email protected] if you have additional questions.
Why Does the Example Seem to Run, and Then Display Nothing Onscreen?

The number one problem for PDAs running applications from a Web server is security. The PDA will often run into an access problem and simply ignore it. You won’t see an error message in many cases. All that will happen is that the PDA will perform whatever local tasks it can perform and then stop when the error happens.

You’ll also see this problem if you use a scripting language the PDA doesn’t understand. Again, the PDA will try to accomplish all that it can locally, and then it’ll simply stop. Remember that Windows CE doesn’t provide VBScript support, but it does provide JScript support. Other PDA setups have similar limitations, and you need to check for them during the project definition stage.

Some developers ignore the small screen that a PDA provides. You might not see a result because it appears offscreen. Moving the scrollbar up and down might show an answer that you couldn’t see before. Unfortunately, users will normally complain about the lack of data long before they move the display. In short, make sure that your application fills a single screen at a time so the user doesn’t have to look for the information.

In a very few cases, you’ll find that all of the client files need to appear on the PDA instead of the Web server. For example, downloading a Web page so that the PDA can access the SOAP scripts might not work. Test all of the machines your application targets to ensure you won’t run into this type of problem.
Why Doesn’t My Code Run Properly on All of My PDAs?

During the development process for the examples in this chapter and a few personal projects, I found that even though a PDA runs the same operating system as another PDA, you need to account for some differences. For example, PDAs use different processors, so you can’t assume anything about low-level code. Consequently, it’s important to use high-level languages if you plan to work with the code on more than one machine.

You’ll also find that PDAs require driver and other software updates from time to time, just as desktop machines do. Make sure you download and install all patches for your PDA before you begin development. In fact, you’ll want to perform this step before you install the SOAP toolkit of your choice. This ensures that the installation routine for the SOAP toolkit makes the right choices about your system.

Some SOAP implementations require support files. Make sure you install all of the required support before you attempt to run an application on the PDA. For example, the Windows CE machines in the example all required a copy of pocketSOAP before they would run the example. The Palm systems required both the SOAP toolkit and the XML parser. Palm systems require two separate installations, so leaving one part of the support mechanism out of the picture is relatively easy.
How Do I Fix Messaging Problems with the Client?

Most messaging problems appear when you use more than one toolkit to accomplish a task. The server might expect something other than the normal client output. However, you can normally modify the data stream enough to accommodate the requirements of client or server. The various tracing tools allow you to compare message formats and change the standard code as needed to create a message the server will understand. Likewise, you can normally modify the server output to accommodate client needs.

Another problem is live connection confusion. A PDA requires a live connection with the server, not a synchronized connection. A live connection allows data exchange between client and server, while a synchronized connection allows the host workstation to update the PDA’s files. In some cases, a synchronized connection might look live, but it isn’t. PDAs always require some type of direct network connection to provide data connectivity, plus a driver that allows network configuration. In short, don’t assume all wireless connections are live simply because they tap into the network; many simply provide a better way to perform synchronization.

Always double-check a failed PDA connection using a desktop client that you know works. PDA connections can fail in numerous ways that a desktop connection won’t. In at least a few cases, you’ll find that the server has stopped responding and the desktop application will provide an error message to this effect. PDAs often suppress error messages, making debugging especially difficult.
APPENDIX A

SOAP Data Types and Data Type Conversions

In this appendix
Data Types Overview
Complex Data Types
Differences in Implementation
Data Type Conversions
Any data exchange technology worth its salt has to include the concept of data types. SOAP is no exception. You’ll find that SOAP messages use a simple, but usable set of data types. Just how usable these data types are depends on how the SOAP toolkit you use implements them. As I write this, there are several different prevailing views of data typing with SOAP and these views affect how SOAP toolkit vendors implement the specification. The SOAP specification doesn’t address the issue of data typing itself, but instead defers to several XML specifications. You’ll find these XML specifications listed in the SOAP specification (http://static.userland.com/xmlRpcCom/soap/SOAPv11.htm). While you’re looking at the SOAP specification, it’s important to note two of the author names on this document. The first is Don Box, the second Dave Winer. These two individuals have contributed a great deal to the current specification. However, when you talk with them online, you find that they have very different viewpoints on the current SOAP implementation, and where they would like to see it go in the future. (Many other authors contributed to the SOAP specification, but Don Box and Dave Winer seem to be two of the more vocal contributors.)
Don Box and Dave Winer are both prolific writers, along with being great computer scientists. You can read Don’s current assessment of the state of SOAP at http://www.develop.com/dbox/postsoap.html. Compare this to Dave’s view of the current state of SOAP at http://www.xmlrpc.com/stories/storyReader$555, http://www.xmlrpc.com/stories/storyReader$1387 and http://www.xmlrpc.com/stories/storyReader$1433. Reading these four pieces gives you a good understanding of why things work the way they do today. On one hand, you have Don’s enterprise view of SOAP; on the other hand, you have Dave’s developer view of SOAP.
So, what does this have to do with data types? A data type is more than a simple definition of a string or a float. Knowing that data of a certain type exists isn’t enough to allow data exchange. This appendix addresses the issue of data types in a fuller sense of the term. We’ll discuss how SOAP applications exchange data and what you might need to do to ensure interoperability between SOAP implementations. Obviously, the whole issue of SOAP data types is under constant change. You’ll want to check the SOAP specification. Also check online list servers such as the one found on the Developmentor Web site (http://discuss.develop.com/) and Microsoft’s newsgroups (microsoft.public.msdn.soaptoolkit, microsoft.public.xml.soap, and microsoft.public.xml.soapsdk) on the news://news.microsoft.com news server.
Data Types Overview

Many developers will rely on SOAP to transmit data in text format and perform any required data translation at the endpoints. In many cases, this is the best way to ensure interoperability, but it does entail extra work for the developer and affects the performance of the application.

SOAP supports a relatively short list of simple and complex data types. You’ll find the simple types defined at the SOAP encoding schema Web site (http://schemas.xmlsoap.org/soap/encoding/). Figure A.1 shows some of what you’ll see at this Web site. Table A.1 describes each of these types. It also tells you how to use them within a message.

Figure A.1 You’ll reference the SOAP encoding schema as part of most messages.
Table A.1 SOAP Simple Data Types

Data Type   Usage
base64      Non-ASCII data encoded using the RFC2045 rules (http://www.faqs.org/rfcs/rfc2045.html). This data type enables you to transfer data using the same techniques as Multipurpose Internet Mail Extensions (MIME) without the line length limitations.
binary      Non-ASCII data encoded using standard hexadecimal methods. You normally avoid this data type because it doesn’t transfer well through firewalls.
boolean     A value of true or false represented by a 1 for true and 0 for false. Boolean representations differ among programming languages, which means you may have to convert this value to a valid language representation after the XML parser interprets it.
byte        A data type derived from short where the maximum inclusive value is set to 127 and the minimum inclusive value is set to -128.
date        A Gregorian date representation as defined by ISO8601. The date uses a format of YYYY-MM-DD (for example, 2001-03-31 for March 31, 2001). It may include an optional time zone value that helps in ordering dates.
decimal     A decimal (base 10) representation of a number, used to increase the accuracy of some types of calculation such as financial data. The sender must indicate the precision of the decimal value (normally 18 digits) and the number of fractional digits within the number.
double      An IEEE 64-bit precision floating-point number as defined by IEEE 754-1985 (http://standards.ieee.org/reading/ieee/std_public/description/busarch/7541985_desc.html).
float       An IEEE 32-bit precision floating-point number as defined by IEEE 754-1985.
int         A data type derived from long where the maximum inclusive value is set to 2,147,483,647 and the minimum inclusive value is set to -2,147,483,648.
integer     A data type derived from decimal where the number of fractional digits is set to 0. This differs from most language integer definitions and avoids some of the problems associated with supporting multiple platforms. Note that integer is the type used to define all other integer data types.
long        A data type derived from integer where the maximum inclusive value is set to 9,223,372,036,854,775,807 and the minimum inclusive value is set to -9,223,372,036,854,775,808.
short       A data type derived from int where the maximum inclusive value is set to 32,767 and the minimum inclusive value is set to -32,768.
string      The sequence of human-readable characters normally associated with text.
time        The representation of a single instant of time in a 24-hour day. The time data type uses a format of hh:mm:ss.sss in accordance with ISO8601. This data type may include an optional time zone value that helps in ordering times.
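The inclusive ranges in Table A.1 lend themselves to a simple check at the endpoint before a text value is handed to application code. The following C++ sketch is one possible way to do this, assuming the value arrives as plain text from the XML parser. The helper names (parseInRange, parseByte, parseShort, parseInt, parseBoolean) are hypothetical and don’t belong to any SOAP toolkit. The boolean helper also accepts the literals true and false, which XML Schema allows alongside 1 and 0.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Parse a base-10 string and confirm that it fits an inclusive range.
// Returns true and stores the value in `out` on success. Like strtoll(),
// this accepts leading whitespace and an optional sign.
bool parseInRange(const std::string& text, long long lo, long long hi,
                  long long& out)
{
    if (text.empty()) {
        return false;
    }
    errno = 0;
    char* end = nullptr;
    const long long value = std::strtoll(text.c_str(), &end, 10);
    if (errno == ERANGE || end == text.c_str() || *end != '\0') {
        return false;  // overflow, no digits, or trailing characters
    }
    if (value < lo || value > hi) {
        return false;  // outside the range of the derived type
    }
    out = value;
    return true;
}

// Ranges taken from Table A.1.
bool parseByte(const std::string& text, long long& out)
{
    return parseInRange(text, -128LL, 127LL, out);
}

bool parseShort(const std::string& text, long long& out)
{
    return parseInRange(text, -32768LL, 32767LL, out);
}

bool parseInt(const std::string& text, long long& out)
{
    return parseInRange(text, -2147483648LL, 2147483647LL, out);
}

// Convert the wire representation of a boolean to a C++ bool.
bool parseBoolean(const std::string& text, bool& out)
{
    if (text == "1" || text == "true")  { out = true;  return true; }
    if (text == "0" || text == "false") { out = false; return true; }
    return false;
}
```

A receiver could call parseByte("128", value) and reject the message when the function returns false, rather than silently truncating the number.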
Table A.1 doesn’t contain data types that you’d normally use in areas other than data transfer elements. For example, it doesn’t include the uriReference type. The specification also includes support for some derivations of simple data types such as unsignedLong and negativeInteger. Some data types are actually subsets of the types listed in Table A.1 such as month, day, and year. You can find out more about these other types by looking at the encoding schema Web page. Of course, this Web page provides little in the way of explanation—it only tells you about implementation. You can read the descriptions for all of the data types that XML supports at http://www.w3.org/TR/xmlschema-2/. This specification doesn’t contain descriptions for data types unique to SOAP. It’s important to differentiate between the simple types that XML supports and those supported by SOAP. The SOAP specification is unclear on this issue, but the schema encoding Web page will clear things up for you. You’ll find that SOAP supports most XML data types. However, some simple data types receive support as derived types. For example, the XML duration data type is timeDuration and recurringDuration under SOAP. Likewise, SOAP doesn’t support the dateTime XML data type, but does support individual date and time data types. SOAP supports a binary data type that doesn’t appear in the XML specification—XML supports only hexBinary and base64Binary data types. The SOAP specification also places enumerated values within the simple type category. You use an enumeration to define a specific set of values for a user-defined data type based on the simple data types found in Table A.1. Here’s an example of an enumeration.
Forward 70
In this case, a car is going forward at a speed of 70. The enumeration limits the number of valid values to forward, backward, left, and right. This means that the car couldn’t go up, for example. The use of an enumeration helps document the SOAP message. It also provides a form of validation for the message. However, you need to use enumerations with care because they can also limit the flexibility and extensibility of a schema that you create.
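The validation role of an enumeration can be mirrored in client or server code. The C++ sketch below is a minimal illustration that assumes the four directions named in the text are the only legal values. The names kDirections and isValidDirection, and the exact spelling and capitalization of the values, are assumptions made for the example.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Allowed values, mirroring the four directions named in the text.
// The spelling and capitalization used here are assumptions.
const std::array<std::string, 4> kDirections = {
    {"Forward", "Backward", "Left", "Right"}
};

// Return true only when `value` matches one of the enumerated values.
// The comparison is case-sensitive, as XML comparisons are.
bool isValidDirection(const std::string& value)
{
    return std::find(kDirections.begin(), kDirections.end(), value)
           != kDirections.end();
}
```

With this check in place, isValidDirection("Up") returns false, which is the code-level equivalent of the car not being able to go up. Adding a fifth direction later means changing both the schema and this list, which is the loss of flexibility the text warns about.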
Complex Data Types

Sometimes it helps to group data together in meaningful arrangements. Complex data types contain arrangements of simple data types. However, the declaration of a complex type is similar to a simple type as shown in Figure A.2. The key declarations tell you how SOAP will use various keywords such as type. Note that the declaration of complex and simple types is mutually exclusive as defined by the tag. SOAP supports two complex (or compound) data types: array and structure.
The SOAP specification defines many ways to use the two complex data types that it supports. However, before you can actually use complex data types, your SOAP toolkit needs to support them. Support for complex data types is getting better, but some SOAP toolkits don’t support them yet. Even if your SOAP toolkit supports complex data types, there isn’t any guarantee that the implementation will fit in with the implementation used by other vendors. The use of complex data types represents one of the major reasons for interoperability problems in applications right now. In short, use complex data types with care. Exploit complex data types only if you plan to use the same SOAP toolkit for the entire application.
Structures normally pass a single instance of related data from one point to another. For example, you could view a single database record as a type of structure. Here’s a simple schema of a structure under SOAP.
Figure A.2 It pays to look at the various key declarations because they show how SOAP uses specific elements.
This schema defines a complex user-defined data type named Contact that contains three strings. Each string defines a particular element of the contact entry. Here’s this schema in action. George Smith 100 North Street (303)555-1212
As you can see, the schema packages the data and makes it easier to validate. The package arrives on the other end of the wire in a format that’s easy to parse into a database. While the names within the structure are significant, the order of the elements isn’t. The SOAP message can contain the structure elements in any order so long as the sender doesn’t violate any of the structure schema rules.

Structures can use all of the normal XML extras such as identifiers and references. You create an identifier by adding the id attribute to an element within the SOAP message, and a reference points back to that element using the familiar href attribute.

Arrays allow you to access individual elements of grouped data. Unlike many programming languages, SOAP doesn’t force you to use the same data type for all array elements. You can also create ragged arrays with a little creative programming. Here’s an example of a simple array.

3 ”Hello World” true
The use of the ur-type allows you to define an array containing elements of mixed types. You can also use a simple method to create an array of the same elements without having to define each element type individually. Here’s an example of an array containing just one type. 3 4 5
You can create arrays of arrays or structures. Like structures, you can also include identifiers and references in your array as long as you meet any required schema restrictions. In short, arrays allow you to move data in a way that allows easy access of individual data elements. You can use arrays in SOAP as you would in any programming language, with the appropriate caveats, of course. While many SOAP toolkits support some form of array, some don’t support anything more than single dimension arrays now. In addition, you’ll find that support for ragged arrays is lacking. Of course, you can always go the hand-coding route if you want to implement a solution today that the usual assortment of toolkits doesn’t support.
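If you do go the hand-coding route for a single-type array, the serialization step is small. The following C++ sketch shows one way to write a vector of integers using the arrayType attribute from the SOAP 1.1 encoding rules. It is an illustration under assumptions rather than output from any toolkit: the function name encodeIntArray and the element names numbers and item are hypothetical, and the SOAP-ENC and xsd namespace prefixes are assumed to be declared on an enclosing element of the message.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Serialize integers as a SOAP 1.1 encoded array fragment.
// Element names "numbers" and "item" are hypothetical. The SOAP-ENC and
// xsd prefixes must be declared elsewhere in the message.
std::string encodeIntArray(const std::vector<int>& values)
{
    std::ostringstream xml;

    // The arrayType attribute carries both the element type and the size.
    xml << "<numbers SOAP-ENC:arrayType=\"xsd:int["
        << values.size() << "]\">";

    for (std::size_t i = 0; i < values.size(); ++i) {
        xml << "<item>" << values[i] << "</item>";
    }

    xml << "</numbers>";
    return xml.str();
}
```

Passing the values 3, 4, and 5 produces a numbers element whose arrayType is xsd:int[3] and which contains three item children. A mixed-type array would instead declare xsd:ur-type as the array type and mark each child with its own xsi:type attribute.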
Differences in Implementation

Now that you understand the data types, it’s time to consider how vendors implement these data types. The technique used to implement the data type determines where the type definitions appear within the SOAP message. There are two common methods in use by most SOAP toolkit vendors now.

■ Define the data types as part of the elements in the SOAP message.
■ Define the data types as part of an external file that the listener or XML parser uses to decode the SOAP message.
Let’s begin by looking at the first method for defining data types. Many developers feel that using external files such as WSDL unnecessarily complicates what should be a simple protocol for exchanging data. Here’s a simple SOAP message that contains data type information as part of the message.

Mueller John

   mssoapinit(_T("http://WinServer/ssss4c/soap.asp?WSDL"),
              _T("demoService"),
              _T("ServInfoPort"),
              _T(""));
   if(FAILED(hr))
      MessageBox("Cannot initialize SoapClient. ",
                 "Error",
                 MB_OK | MB_ICONEXCLAMATION);

   // Prepare the method variable.
   pMethodName = L"GetServerInfo";

   // Create a mapping of the dispatch ID corresponding to the method name.
   hr = m_pSoapClient->GetIDsOfNames(IID_NULL,
                                     &pMethodName,
                                     1,
                                     LOCALE_SYSTEM_DEFAULT,
                                     &dispid);
   if(FAILED(hr))
   {
      MessageBox("Cannot get dispatch id of calc method.",
                 "Error",
                 MB_OK | MB_ICONEXCLAMATION);
      return;
   }

   // Initialize DISPPARAMS structure. No parameters are required, so set
   // the number of arguments to 0.
   dispparams.cArgs = 0;
   dispparams.cNamedArgs = 0;
   dispparams.rgdispidNamedArgs = NULL;

   // Prepare result variant.
   VariantInit(&result);

   // Invoke the specified method.
   hr = m_pSoapClient->Invoke(dispid,
                              IID_NULL,
                              LOCALE_SYSTEM_DEFAULT,
                              DISPATCH_METHOD,
                              &dispparams,
                              &result,
                              &ExceptInfo,
                              NULL);
   if(FAILED(hr))
   {
      MessageBox("Invoke of calc method failed.",
                 "Error",
                 MB_OK | MB_ICONEXCLAMATION);
   }
   else
   {
      // Display result.
      ParamText = result.bstrVal;
      MessageBox(ParamText, "Success", MB_OK | MB_ICONINFORMATION);
   }

   // Clean up variants.
   VariantClear(&result);
}
If you looked at the other examples in the book, you’ll notice the Visual C++ version is much longer and more complex. However, it also provides you with precise control over the client. The use of Visual C++ can change how the client interacts with the SOAP application. For example, you have better control over the way output parameters are translated into a form that SOAP can understand. The client application works much the same as the Visual Basic examples. It begins by instantiating a client object. The object uses mssoapinit() to create a connection with the server. Notice that you must provide all four arguments when working with Visual C++. Make certain that you set any unneeded parameters to a null string value as shown in the example. Otherwise, SOAP will claim that it can’t create the connection for you. However, once you create the connection, you’ll follow a different process than Visual Basic. The first step is to gain access to the dispatch ID of the component. You do this by specifying the name of the method you want to use. Notice that the first argument for the GetIDsOfNames() method is set to IID_NULL. This value is currently reserved and you must set the argument value to IID_NULL for every case. You can pass in either a single method name or an array of names. The third argument sets the number of methods to return. If you
request an array of methods, then the dispatch ID variable will contain an array of dispatch IDs on return. Otherwise, it contains a single dispatch ID that you can access directly. This example doesn’t require the use of any arguments. If it had required parameters (as the Calc example on the Web site), then you would have created an array of arguments as the next step in the process. The array must be of type VARIANT and you must set the correct variant type for each argument. Once you create this array, it’s referenced by the dispatch parameters structure. Notice that the values in the dispparams structure for the example are set to 0 because you have no arguments. Normally, you’ll set this value to a number that corresponds to the number of entries in the argument array. Calling the method comes next. You’ll use the client’s Invoke() method to perform this task. Notice that you’ll need to pass the dispatch ID of the method that you want to call; not the entire array of dispatch IDs. The three most important arguments are the dispatch parameters structure, the result variable (another VARIANT), and the exception variable. On return, the result variable contains the server information requested. Obviously, you’ll want to clean up any variables you create once you retrieve the information from the server. Now that you’ve looked at the application, it’s time to compile and test it. Figure D.7 shows the results received from the example. Your results will differ depending on the name and setup of your server. Figure D.7 The server information application provides details on the name of the server and the operating system it uses.
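When a method does take parameters, one detail of the argument array is easy to get wrong: IDispatch::Invoke() expects the entries of the argument array in reverse order, so the first array element holds the last argument of the call. The C++ sketch below illustrates only that ordering rule. It uses portable stand-in types (SketchArg and SketchDispParams, both hypothetical) in place of the real VARIANTARG and DISPPARAMS structures from the Windows headers, and it stores values as strings instead of setting a variant type for each argument.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Portable stand-ins for illustration only. Real code uses VARIANTARG
// and DISPPARAMS from the Windows headers and sets the variant type of
// every argument.
struct SketchArg {
    std::string value;
};

struct SketchDispParams {
    std::vector<SketchArg> rgvarg;  // arguments, last one first
    std::size_t cArgs;              // number of arguments
};

// Pack arguments supplied in call order into the reversed layout that
// Invoke() expects: rgvarg[0] is the LAST argument of the call.
SketchDispParams packArguments(const std::vector<std::string>& inCallOrder)
{
    SketchDispParams params;
    params.cArgs = inCallOrder.size();
    for (std::vector<std::string>::const_reverse_iterator it =
             inCallOrder.rbegin();
         it != inCallOrder.rend(); ++it) {
        SketchArg arg;
        arg.value = *it;
        params.rgvarg.push_back(arg);
    }
    return params;
}
```

For a hypothetical two-argument call such as Add(2, 3), packArguments() places "3" at index 0 and "2" at index 1, with cArgs set to 2. In real code the same ordering applies when you fill the VARIANT array that the dispparams structure references.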
Handling SOAP Errors
SOAP provides the same error-handling capability through the client in Visual C++ as it does in Visual Basic. The method used to access the data is slightly different, but will look familiar to anyone who has worked with Visual C++ in the past. As part of the task of learning about error handling for Visual C++, I decided to create a clone of the Visual Basic example in Chapter 5, “Migrating an Application from DCOM to SOAP.” The DisplaySOAPFault() method shown here provides similar functionality to that example. (You can find the complete example in the \Appendix D\Throw Error directory of the Web site for this book at www.quepublishing.com.)
void CThrowErrorDlg::DisplaySOAPFault(LPCTSTR pMessage)
{
    HRESULT hr;                  // Result of call.
    BSTR    FaultString = NULL;  // Fault string holder.
    CString ErrorMsg;            // Final error message.

    // Begin building a message. Each method call will
    // retrieve a different part of the fault string.
    // Each call returns a newly allocated string, so
    // free it before reusing the variable.
    ErrorMsg = pMessage;
    ErrorMsg = ErrorMsg + "\r\nFault Code: ";
    hr = m_pSoapClient->get_faultcode(&FaultString);
    ErrorMsg = ErrorMsg + FaultString;
    SysFreeString(FaultString); FaultString = NULL;
    ErrorMsg = ErrorMsg + "\r\nFault String: ";
    hr = m_pSoapClient->get_faultstring(&FaultString);
    ErrorMsg = ErrorMsg + FaultString;
    SysFreeString(FaultString); FaultString = NULL;
    ErrorMsg = ErrorMsg + "\r\nFault Actor: ";
    hr = m_pSoapClient->get_faultactor(&FaultString);
    ErrorMsg = ErrorMsg + FaultString;
    SysFreeString(FaultString); FaultString = NULL;
    ErrorMsg = ErrorMsg + "\r\nFault Detail: ";
    hr = m_pSoapClient->get_detail(&FaultString);
    ErrorMsg = ErrorMsg + FaultString;

    // Free the string used to retrieve the data.
    SysFreeString(FaultString);

    // Display the error message.
    MessageBox(ErrorMsg, "Error", MB_OK | MB_ICONEXCLAMATION);
}
As you can see, many of the calls are the same as the Visual Basic example, but use syntax that only a C++ programmer could love. The major concern is ensuring you free the fault string once you’re finished with it. The example code relies on the SysFreeString() call to perform the task. Figure D.8 shows the output from the ThrowError application.
Figure D.8 The ThrowError application outputs the same error information as the example in Chapter 5.
Notice that each call in the DisplaySOAPFault() method produces a result value. This actually makes the Visual C++ detection scheme better than what you get with Visual Basic. You can check the HRESULT after each call to ensure you still get valid information. Given the nature of SOAP applications, the connection could break between calls. By checking the HRESULT value, you can provide the user with partial SOAP feedback, plus some additional information about the cause of the eventual connection failure. Here’s some code you could use to handle the HRESULT values.
LPVOID  lpMsgBuf;  // Message Buffer
CString Msg;       // Resulting Error Message

// Create the message.
FormatMessage(
    FORMAT_MESSAGE_ALLOCATE_BUFFER |
    FORMAT_MESSAGE_FROM_SYSTEM |
    FORMAT_MESSAGE_IGNORE_INSERTS,
    NULL,
    hr,
    MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
    (LPTSTR) &lpMsgBuf,
    0,
    NULL );

// Convert the message into the proper format and display it.
Msg = (LPCTSTR)lpMsgBuf;
MessageBox(Msg, "Error", MB_OK | MB_ICONEXCLAMATION);

// Free the memory used by the message buffer.
LocalFree(lpMsgBuf);
return;
As you can see, all you need is a buffer and the HRESULT value to pass to the FormatMessage() function. Because the buffer is declared as a void pointer (LPVOID), you’ll need to cast it to a string type. After the cast, you can display the message as normal. The last step is to free the message buffer with LocalFree().
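Before handing an HRESULT to FormatMessage(), it can help to know what the value itself encodes. The following is a minimal, portable sketch rather than part of the book’s example: the HResultParts structure and the SplitHResult() function are hypothetical names, and a plain 32-bit integer stands in for the Windows HRESULT type so that the code compiles without the Windows headers. The masks mirror the FAILED, HRESULT_FACILITY, and HRESULT_CODE macros in winerror.h.

```cpp
#include <cstdint>

// Hypothetical helper type; not part of the Windows SDK.
struct HResultParts {
    bool failed;             // True when the severity bit (bit 31) is set.
    std::uint32_t facility;  // Subsystem that produced the status value.
    std::uint32_t code;      // Subsystem-specific status code.
};

// Split a raw 32-bit HRESULT value into its fields.
// A plain std::int32_t stands in for HRESULT (an assumption made
// so the sketch builds on any platform).
inline HResultParts SplitHResult(std::int32_t hr) {
    const std::uint32_t bits = static_cast<std::uint32_t>(hr);
    HResultParts parts;
    parts.failed = hr < 0;                    // Same test as FAILED(hr).
    parts.facility = (bits >> 16) & 0x1FFFu;  // Same mask as HRESULT_FACILITY.
    parts.code = bits & 0xFFFFu;              // Same mask as HRESULT_CODE.
    return parts;
}
```

For example, passing 0x80070005 (E_ACCESSDENIED) yields a failed status with facility 7 (FACILITY_WIN32) and code 5. A client could log these fields next to the text that FormatMessage() returns.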
GLOSSARY
This book includes a Glossary so that you can find terms and acronyms easily. It has several important features of which you need to be aware. First, every acronym in the entire book is listed here (even those you may already know). This way there is no doubt that you’ll be able to find everything you need to use the book properly. Second, these definitions are specific to the book. In other words, when you look through this glossary, you’re seeing the words defined in the context in which the book uses them. This might or might not always coincide with current industry usage because the computer industry changes the meaning of words so often. Finally, the definitions here use a conversational tone in most cases. This means they might sacrifice a bit of puritanical accuracy for the sake of better understanding. The purpose of this glossary is to define the terms in such a way that there’s little room for misunderstanding the intent of the book as a whole.
Although this Glossary is a complete view of the words and acronyms in the book, you’ll run into situations when you need to know more. No matter how closely I look at terms throughout the book, it’s always possible I might miss the one acronym or term that you really need to know. In addition, I’ve directed your attention to numerous online sources of information. Few of the terms the Web site owners use will appear here unless I also chose to use them in the book. Fortunately, many sites on the Internet provide partial or complete Glossaries to fill in the gaps:
■ Acronym Finder (http://www.acronymfinder.com/)
■ Microsoft Encarta (http://encarta.msn.com/)
■ University of Texas Acronyms and Abbreviations (http://wwwhep.uta.edu/~variable/e_comm/pages/r_dic-en.htm)
■ Webopedia (http://webopedia.internet.com/)
■ yourDictionary.com (formerly A Web of Online Dictionaries) (http://www.yourdictionary.com/)
Let’s talk about these Web sites a little more. Web sites normally provide acronyms or glossary entries—not both. An acronym site only provides the definition for the acronym that you want to learn about; it doesn’t provide an explanation of what the acronym means concerning everyday computer use. The two extremes in this list are Acronym Finder (acronyms only) and Webopedia (full-fledged glossary entries). The owner of Acronym Finder doesn’t update the site as often as the University of Texas, but Acronym Finder does have the advantage of providing an extremely large list of acronyms from which to choose. At the time of this writing, the Acronym Finder sported 164,000 acronyms. The University of Texas site receives updates often and provides only acronyms. (Another page at the same site includes a Glossary.) Most of the Web sites that you’ll find for computer terms are free. In some cases, such as Microsoft’s Encarta, you have to pay for the support provided. However, these locations are still worth the effort because they ensure you understand the terms used in the jargon-filled world of computing.
Webopedia has become one of my favorite places to visit because it provides encyclopedic coverage of many computer terms and includes links to other Web sites. I like the fact that if I don’t find a word I need, I can submit it to the Webopedia staff for addition to their dictionary, making Webopedia a community-supported dictionary of the highest quality.
One of the interesting features of the yourDictionary.com Web site is that it provides access to more than one dictionary and in more than one language. If English isn’t your native tongue, then this should be your Web site of choice.
Now that we have the preliminaries out of the way, it’s time to provide some definitions.
Active Directory Services Interface (ADSI) A set of APIs used to access Active Directory, the central repository of information in Windows 2000. Active Directory is a hierarchical database used to store many types of information in a somewhat freeform format. ADSI allows access to both Active Directory data and the schema, which means you can use it to create new database elements, as well as remove elements that are no longer in use.
Active Server Page (ASP) A special type of scripting language used by Windows NT Server equipped with Internet Information Server (IIS). This specialized scripting language allows the programmer to create very flexible Web server scripts. The use of variables and other features, such as access to server variables, allows a programmer to create scripts that can compensate for user and environmental needs as well as security concerns. ASP pages use HTML to display content to the user.
Adaptive Network Security Alliance (ANSA) A proposed specification that ensures interoperability between security vendor application programming interfaces (APIs) using common interfaces for inspecting content. Competing standards include Open Platform for Secure Enterprise Connectivity (OPSEC) Alliance and Common Content Inspection (CCI). This specification targets the Internet Security System (ISS) Group’s intrusion detection software.
ADSI See Active Directory Services Interface.
American Standard Code for Information Interchange (ASCII) A standard method of equating the numeric representations available in a computer to human-readable form. The number 32 represents a space, for example. The standard ASCII code contains 128 characters (7 bits). The extended ASCII code uses 8 bits for 256 characters. Display adapters from the same machine type usually use the same upper 128 characters. Printers, however, might reserve these upper 128 characters for nonstandard characters. Many Epson printers, for example, use them for the italic representations of the lower 128 characters.
ANSA See Adaptive Network Security Alliance.
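The 7-bit limit in the ASCII entry above can be checked in a few lines of code. This is only an illustrative sketch; the IsStandardAscii() function name is hypothetical and doesn’t come from the book’s examples.

```cpp
#include <string>

// Returns true when every character in the text falls in the
// standard 7-bit ASCII range (0 through 127). Any character from
// the upper 128 "extended" values makes the function return false.
inline bool IsStandardAscii(const std::string& text) {
    for (unsigned char ch : text) {
        if (ch > 127) {
            return false;
        }
    }
    return true;
}
```

A check like this is one way to decide whether text can travel as plain ASCII or needs some other encoding before you send it.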
Application The complete program or group of programs. An application is a complete environment for performing one or more related tasks.
Application Service Provider (ASP) A vendor who provides downloadable or remotely accessible service-oriented code using an Internet connection. Developers can use these objects within their own code to obtain services from the ASP vendor. For example, an airline reservation system could rely on ASP modules from each of the major carriers to provide pricing and reservation support.
ASCII See American Standard Code for Information Interchange.
ASP See Active Server Page.
ASP See Application Service Provider.
AuthXML An XML-based technology that allows session authentication and authorization using an Internet connection. Companies can also use this technology to exchange user authentication lists and to verify the identity of remote users.
Bandwidth A measure of the amount of data a device can transfer in a given time.
Berkeley Internet Name Domain (BIND) This protocol allows Domain Name Service (DNS) servers to translate human-readable Web site names to IP addresses. BIND was originally designed for use on UNIX systems.
Binary A method used to store worksheets and graphics files. Although you can use the DOS TYPE command to send these files to the display, the contents of the file remain unreadable. Other binary files include programs with extensions of EXE or COM.
Binary Compatibility Normally refers to two versions of an object that provide the same interfaces at the lowest possible level. This term is also used generally to refer to two objects of any type that provide low-level compatibility in some way. In some cases, the term refers to two technologies and indicates the amount of low-level similarity the two technologies provide.
Binary Large Object (BLOB) A special field in a database table that accepts objects such as bitmaps, sounds, or text as input. This field is normally associated with the OLE capabilities of a DBMS, but some third-party products make it possible to add BLOB support to older database file formats, such as Xbase DBF file format. BLOB fields always imply OLE client support by the DBMS.
BIND See Berkeley Internet Name Domain.
Biometrics A statistical method of scanning an individual’s unique characteristics, normally body parts, to ensure that an individual is who he says he is. Some of the scanned elements include voiceprints, irises, fingerprints, hands, and facial features. The two most popular elements are irises and fingerprints because they’re the two that most people are familiar with. The advantages of using biometrics are obvious. It not only prevents the user from losing his identifying information (at least not very easily), but with proper scanning techniques, it also protects the identifying information from being compromised.
BLOB See Binary Large Object.
Buffer The area in memory where program variables or other data are stored. For example, applications will normally read more than one page from a word processed document to improve performance. The applications store pages in addition to the one currently viewed by the user in the buffer until needed.
Cascading Style Sheets (CSS) A method for defining a standard Web page template. This might include headings, standard icons, backgrounds, and other features that would tend to give each page at a particular Web site the same appearance. The reasons for using CSS include speed of creating a Web site (it takes less time if you don’t have to create an overall design for each page) and consistency. Changing the overall appearance of a Web site also becomes as easy as changing the style sheet instead of each page individually.
CCI See Common Content Inspection.
CDATA See Character Data Section.
CDSA See Common Data Security Architecture.
Character Data (CDATA) Section Used to prevent interpretation of part of an XML message by the XML parser. A CDATA section normally transports characters that the XML parser would strip, mangle, modify, or simply change in undesirable ways. For example, this is one way to get around the problems with transferring carriage return/linefeed pairs in a message.
Character Set A general reference to a representation of printable or abstract symbols. Character sets are normally encoded numeric forms of the symbology suitable for computer storage.
Class ID (CLSID) A method of assigning a unique identifier to each object in the registry. Also refers to various high-level language constructs.
CLSID See Class ID.
COM See Component Object Model.
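As a sketch of the Character Data (CDATA) Section entry above, the following function wraps text so that an XML parser treats markup characters inside it as plain character data. The WrapInCData() name is hypothetical. The only rule it relies on is that the sequence ]]> ends a CDATA section, so any occurrence inside the text has to be split across two sections.

```cpp
#include <string>

// Wrap text in a CDATA section. Because "]]>" terminates a CDATA
// section, any "]]>" inside the text is split so that "]]" ends
// one section and ">" begins the next.
inline std::string WrapInCData(const std::string& text) {
    const std::string open = "<![CDATA[";
    const std::string close = "]]>";
    std::string result = open;
    std::string::size_type start = 0;
    std::string::size_type hit;
    while ((hit = text.find(close, start)) != std::string::npos) {
        // Copy everything up to and including the "]]".
        result.append(text, start, hit - start + 2);
        result += close;  // End the current section.
        result += open;   // Start a new one before the ">".
        start = hit + 2;
    }
    result.append(text, start, std::string::npos);
    result += close;
    return result;
}
```

Calling WrapInCData("<b>&</b>") produces <![CDATA[<b>&</b>]]>, which a parser reads back as the original seven characters instead of treating them as an element.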
Common Content Inspection (CCI) A specification proposed by Aventail, Corp; Finjan Software, Ltd.; and Check Point Software Technologies to ensure interoperability between security vendor application programming interfaces (APIs). The CCI API seeks to promote interoperability using common interfaces for inspecting content. Competing standards include Adaptive Network Security Alliance (ANSA) and Open Platform for Secure Enterprise Connectivity (OPSEC) Alliance.
Common Data Security Architecture (CDSA) A comprehensive set of security services that will make secure transactions easier. It has a four-layer architecture: application, layered services and middleware, Common Security Services Manager (CSSM) infrastructure, and security service provider modules. CDSA is currently on The Open Group fast track to becoming a standard.
Common Object Request Broker Architecture (CORBA) The purpose of this protocol is to describe data and application code in a way that a variety of computer types can use. It will eventually allow you to go to a Web page and download a mini-application (applet) as part of that page. This is the Object Management Group’s (OMG) alternative to Microsoft’s ActiveX. IBM originally designed CORBA for inclusion with OS/2, but other companies, such as Sun Microsystems, now support this standard as well.
Component Object Model (COM) A Microsoft specification for an object-oriented code and data encapsulation method and transference technique. It’s the basis for technologies such as OLE (object linking and embedding) and ActiveX (the replacement name for OCXs, an object-oriented code library technology). COM is limited to local connections. DCOM (distributed component object model) is the technology used to allow data transfers and the use of OCXs within the Internet environment.
Connectivity A measure of the interactions between clients and servers. In many cases, connectivity begins with the local machine and the interactions between applications and components. Local area networks (LANs) introduce another level of connectivity with machine-to-machine communications. Finally, wide area networks (WANs), metropolitan area networks (MANs), intranets, and the Internet all introduce further levels of connectivity concerns.
Content Vectoring Protocol (CVP) A proposed specification that ensures interoperability between security vendor application programming interfaces (APIs) using common interfaces for inspecting content. Competing standards include Adaptive Network Security Alliance (ANSA) and Common Content Inspection (CCI). This specification targets Check Point’s Firewall-1.
CORBA See Common Object Request Broker Architecture.
Cracker A hacker (computer expert) who uses his skills for misdeeds on computer systems where he has little or no authorized access. A cracker normally possesses specialty software that allows easier access to the target network. In most cases, a cracker requires extensive amounts of time to break the security for a system before he can enter it.
CSS See Cascading Style Sheets.
CVP See Content Vectoring Protocol.
DASS See Distributed Authentication Security Service.
Data Conversion The act of changing data from one format to another. The success of a data conversion depends on how well the converted data models the original data.
DCE See Distributed Computing Environment.
DCOM See Distributed Component Object Model.
Digital Signatures Initiative (DSI) A standard originated by the W3C (World Wide Web Consortium) to overcome limitations of channel-level security. For example, channel-level security can’t deal with documents and application semantics. A channel also doesn’t use the Internet’s bandwidth very efficiently because all the processing takes place on the Internet rather than the client or server. This standard defines a mathematical method for transferring signatures, essentially a unique representation of a specific individual or company. DSI also provides a new method for labeling security properties (PICS2) and a new format for assertions (PEP). This standard is also built on the PKCS #7 and X.509v3 standards.
DII See Dynamic Invocation Interface.
DIME See Direct Internet Message Encapsulation.
Direct Internet Message Encapsulation (DIME) An Internet media type, application/dime. It encapsulates multiple application-defined entities, also known as payloads, into a single package. Each payload can be of arbitrary size and type. The payload description includes data type, length, and an optional payload identifier.
DISCO See Discovery of Web Services.
Discovery of Web Services (DISCO) A service designed to make it easier to locate and use SOAP services. This particular service is SOAP specific and a single vendor, Microsoft, currently supports it. The DISCO service relies on a special protocol named SOAP Control Language (SCL) to allow the discovery of services by remote computers.
Distributed Authentication Security Service (DASS) Defines an experimental method for providing authentication services on the Internet. The goal of authentication, in this case, is to verify who sent a message or request. Current password schemes have a number of problems that DASS tries to solve. For example, it’s impossible to verify that the sender of a password isn’t impersonating someone else. DASS provides authentication services in a distributed environment. Distributed environments present special challenges because users don’t log on to just one machine; they could conceivably log on to every machine on the network.
Distributed Component Object Model (DCOM) The advanced form of the component object model (COM) used by the Internet. This particular format enables data transfers across the Internet or other non-local sources. It adds the capability to perform asynchronous as well as synchronous data transfers, which prevents the client application from becoming blocked as it waits for the server to respond. See Component Object Model for more details.
Distributed Computing Environment (DCE) A specification created by the Open Software Foundation (OSF) that defines methods for data exchange between a client and server. The remote procedure call (RPC) support built into Windows NT is compatible with the DCE specification.
Distributed System Object Model (DSOM) A full implementation of CORBA created by IBM that fulfills the same purpose as Microsoft’s DCOM standard. DSOM is a binary standard used on the network. The System Object Model (SOM) is the equivalent of COM on the local machine.
DLL See Dynamic Link Library.
DNS See Domain Name System.
Document Object Model (DOM) A method for describing the object representation technique used within certain types of documents. Most people associate this term with Internet-based documents such as those found on Web sites. The DOM determines how the document presents objects such as links and text boxes.
Document Type Definition (DTD) A document that defines how an application should interpret markup tags within an HTML, XML, or SGML document. In some cases, such as HTML, the DTD is an actual specification. In other cases, such as XML, the DTD is an external document supplied by the user or the vendor. A DTD can define every characteristic of a document as long as those characteristics are defined using standard tags and attributes.
DOM See Document Object Model.
Domain Name System (DNS) An Internet technology that allows a user to refer to a host computer by name rather than using its unique IP address.
DSI See Digital Signatures Initiative.
DSI See Dynamic Skeleton Interface.
DSOM See Distributed System Object Model.
DTD See Document Type Definition.
Dynamic Invocation Interface (DII) An interface that allows a client direct access to the underlying request mechanisms for an ORB. This interface allows applications to dynamically issue requests to objects without relying on the IDL interface-specific stubs. DII allows blocking RPC-style requests, non-blocking synchronous requests, and send-only calls.
Dynamic Link Library (DLL) A specific form of application code loaded into memory by request. It’s not executable by itself. A DLL does contain one or more discrete routines that an application can use to provide specific features. For example, a DLL could provide a common set of file dialog boxes used to access information on the hard drive. More than one application can use the functions provided by a DLL, reducing overall memory requirements when more than one application is running.
Dynamic Skeleton Interface (DSI) An interface that allows a server direct access to the underlying request mechanisms for an ORB. This interface allows servers to dynamically respond to object requests without compile-time knowledge of the object implementation. The client doesn’t know if it’s using DSI or RPC-style IDL Skeletons. DSI allows blocking RPC-style requests, non-blocking synchronous requests, and send-only calls.
ebXML See Electronic Business eXtensible Markup Language.
Electronic Business eXtensible Markup Language (ebXML) A group of specifications designed to standardize the use of XML globally. This effort will allow uniform business communications based on XML and provide a consistent method to exchange data, create relationships, and register business processes.
Encode The process of transforming a printable or abstract character into a coded format.
eXtensible Hypertext Markup Language (XHTML) A cross between XML and HTML specifically designed for Net devices. Because this language relies on XML, most developers classify it as an XML application builder. The language relies on several standardized namespaces to provide common data type and interface definitions. XHTML creates modules that are interpreted based on a specific platform’s requirements. This means that a single document can serve the needs of many display devices.
eXtensible Markup Language Protocol (XMLP) An online communication protocol alternative to SOAP. XMLP is still in the proposal stage, so there’s little information about it. However, XMLP proponents state that it will provide a simplified method for transferring data while extending the capabilities of protocols, such as SOAP.
eXtensible Markup Language-Remote Procedure Call (XML-RPC) A predecessor to SOAP that allows data exchange between two systems. XML-RPC relies on HTML type organization within an XML framework. It provides some advanced features, such as complex data type and array support. However, XML-RPC doesn’t provide support for such crucial features as namespaces.
eXtensible Markup Language (XML) A standardized Web page design language used to incorporate data structuring within standard HTML documents. For example, you could use XML to display database information using something other than forms or tables. It’s actually a lightweight version of Standard Generalized Markup Language (SGML) and is supported by the SGML community. XML will also support tag extensions that will allow various parts of a Web-based application to exchange information. For example, after a user makes a choice within a catalog, that information can be added to an order entry form with a minimum of effort on the part of the developer. Because XML is easy to extend, some developers look at it as more of a base specification for other languages, rather than a complete language.
eXtensible Rights Markup Language (XRML) A ContentGuard specification that defines how a third party can use content provided by a host company. It describes the rights, fees, and conditions of content. XRML also allows a vendor to define trusted systems that can use a product for testing and evaluation purposes. This technology relies on a trusted server to determine if someone can access content and what rights that person has when he does.
FAQ See Frequently Asked Question.
Firewall A system designed to prevent unauthorized access to or from a network. Firewalls are normally associated with Web sites connected to the Internet. A network administrator can create a firewall using either hardware or software.
Frequently Asked Question (FAQ) A document that contains answers to questions that many people ask. FAQs generally reduce support costs by providing answers to commonly asked questions in one place. Vendors now use FAQs for many purposes, including both hardware and software support.
Globally Unique Identifier (GUID) A 128-bit number used to identify a component object model (COM) object within the Windows registry. The GUID is used to find the object definition and allow applications to create instances of that object. GUIDs can include any type of object, even non-visual elements. In addition, some types of complex objects are actually aggregates of simple objects. For example, an object that implements a property page will normally have a minimum of two GUIDs: one for the property page and another for the object.
GUID See Globally Unique Identifier.
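To make the 128-bit size in the GUID entry concrete, here is a small portable sketch that formats an identifier in the braced text form you see in the registry. The Guid128 structure and FormatGuid() function are hypothetical stand-ins; the structure simply mirrors the field sizes of the Windows GUID structure (32 + 16 + 16 + 64 bits) so the code builds without the Windows headers.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the Windows GUID structure.
// The four fields add up to 128 bits.
struct Guid128 {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Format the identifier in the braced, hyphenated form used for
// registry keys: {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}.
inline std::string FormatGuid(const Guid128& g) {
    char buffer[39];  // 38 characters plus the terminating null.
    std::snprintf(buffer, sizeof(buffer),
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned long>(g.data1),
        static_cast<unsigned>(g.data2),
        static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]),
        static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]),
        static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]),
        static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]),
        static_cast<unsigned>(g.data4[7]));
    return std::string(buffer);
}
```

Filling the structure with the well-known interface ID for IUnknown (data1 through data3 set to 0, data4 set to C0 00 00 00 00 00 00 46) produces {00000000-0000-0000-C000-000000000046}.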
Hacker An individual who works with computers at a low level, especially in the area of security. A hacker normally possesses specialty software that allows easier access to the target application or network. In most cases, hackers require extensive amounts of time to break the security for a system before they can enter it. The two types of hackers include those that break into systems for ethical purposes and those that do it to damage the system in some way. The proper term for the second group is crackers. Some people have started to call the first group “ethical hackers” to prevent confusion. Ethical hackers normally work for security firms that specialize in finding holes in a company’s security. However, hackers work in a wide range of computer arenas. For example, a person who writes low-level code (like that found in a device driver) after reverse engineering an existing driver is technically a hacker.
HTTP See Hypertext Transfer Protocol.
HTTP Extension Framework See Hypertext Transfer Protocol Extension Framework.
Hypertext Transfer Protocol (HTTP) One of several common data transfer protocols for the Internet. This particular protocol specializes in the display of onscreen information, such as data entry forms or information displays. HTTP relies on HTML as a scripting language for describing special screen display elements, although you can also use HTTP to display non-formatted text.
Hypertext Transfer Protocol Extension Framework (HTTP Extension Framework) Used with non-standard extensions to the HTTP header. The HTTP Extension Framework describes which non-standard extensions a document contains. It also determines how the recipient should handle them. Common uses of an HTTP Extension Framework include protocols such as SOAP.
IANA Internet Assigned Numbers Authority.
IDL See Interface Definition Language.
IETF See Internet Engineering Task Force.
IIOP See Internet Inter-ORB Protocol.
IIS See Internet Information Server.
Infrastructure The underlying base of an organization or system. One way to view infrastructure is as the foundation on which all other elements of a system or organization are attached. Many vendors use this term to indicate the compatibility of their product with existing installations.
Interface Definition Language (IDL) A programming language construct used to define the interfaces, methods, and parameters of a class. The IDL might use attributes to describe some elements fully using a common methodology. In addition, the IDL normally includes binary elements, such as interface identifiers. For example, COM relies on globally unique identifiers (GUIDs) for identification purposes.
Internet Engineering Task Force (IETF) The standards group tasked with finding solutions to pressing technology problems on the Internet. This group can approve standards created both within the organization itself and outside the organization as part of other group efforts. For example, Microsoft has requested the approval of several new Internet technologies through this group. If approved, the technologies would become Internet-wide standards for performing data transfer and other specific kinds of tasks.
Internet Information Server (IIS) Microsoft’s full-fledged Web server that normally runs under the Windows NT Server operating system. IIS includes all the features that you would normally expect with a Web server: FTP, HTTP, and Gopher protocols along with both mail and news services. Both Windows NT Workstation and Windows 95 can run Personal Web Server (PWS), which is a scaled-down version of IIS.
Internet Inter-ORB Protocol (IIOP) A binary protocol that the Object Management Group (OMG) designed for Internet use. This makes IIOP different from protocols such as DCOM and CORBA that are designed for LAN use only. IIOP does perform better on the Internet than DCOM or CORBA, but it has the same problems as DCOM and CORBA in that it doesn’t communicate well through firewalls.
Internet Server Application Programming Interface (ISAPI) A set of function calls and interface elements designed to make using Microsoft’s Internet Information Server (IIS) and associated products such as Peer Web Server easier. Essentially, this set of API calls provides the programmer with access to the server. Such access makes it easier to provide full server access to the Internet server through a series of ActiveX controls without the use of a scripting language. ISAPI comes in two forms: filters and extensions. An extension replaces current script-based technologies, such as CGI. Its main purpose is to provide dynamic content to the user. A filter can extend the server by monitoring various events like user requests for access in the background. You can use a filter to create various types of new services, such as extended logging or specialized security schemes.
Interoperability A measure of an application’s ability to run in more than one environment, compatible or not. This term often refers to the ability of an application to run on more than one operating system or hardware platform. In some cases, this term refers to middleware’s ability to overcome interoperability problems between platforms.
ISAPI See Internet Server Application Programming Interface.
Java Document Object Model (JDOM) An object description technique that combines the best features of the Simple API for XML (SAX) and the Document Object Model (DOM). Few XML parsers currently support this new standard.
JDOM See Java Document Object Model.
LAN
See Local Area Network.
Local Area Network (LAN) Two or more devices connected together using a combination of hardware and software. The devices, normally computers and peripheral equipment such as printers, are called nodes. An NIC (network interface card) provides the hardware communication between nodes through an appropriate medium (cable or microwave transmission). There are two common types of LANs (also called networks). Peer-to-peer networks allow each node to connect to any other node on the network with shareable resources. This is a distributed method of sharing files and peripheral devices. A client-server network uses one or more servers to share resources. This is a centralized method of sharing files and peripheral devices. A server provides resources to clients (usually workstations). The most common server is the file server, which provides file-sharing resources. Other server types include print servers and communication servers.
MAN
See Metropolitan Area Network.
Mathematical Markup Language (MathML) An XML-based technique for describing math notation. This includes both the structure and the content of the notation. It allows standardized processing of complex equations over the Internet. MathML
See Mathematical Markup Language.
Message Transfer Agent (MTA) This is an X.400 standard term that refers to the part of a message transfer system (MTS) responsible for interacting with the client. For example, in an e-mail system, the MTA delivers e-mail to the individual users of that system.
Metropolitan Area Network (MAN) A partial extension and redefinition of the WAN, a MAN connects two or more LANs together using a variety of methods. A MAN usually encompasses more than one physical location within a limited geographical area, usually within the same city or state. (A WAN can cover a larger geographical area, and sometimes includes country-to-country communications.) Most MANs rely on microwave communications, fiber-optic connections, or leased telephone lines to provide the internetwork connections required to keep all nodes in the network talking with each other.
Microsoft Management Console (MMC) A special application that acts as an object container for Windows management objects, such as Component Services and Computer Management. The management objects are actually special components that provide interfaces that allow the user to access them within MMC to maintain and control the operation of Windows. A developer can create special versions of these objects for application management or other tasks. Using a single application like MMC helps maintain the same user interface across all management applications.
Microsoft SOAP Messaging Object (SMO) A combination of component and wizard that reduces the effort required to write SOAP-enabled components. It allows you to simulate the existing data exchange environment within the client. The client and server component code remain about the same using this method because the major change occurs with the listener and SMO code you need to write.
MIME
See Multipurpose Internet Mail Extensions.
MMC
See Microsoft Management Console.
MTA
See Message Transfer Agent.
Multipurpose Internet Mail Extensions (MIME) The standard method for defining the content of Internet messages. This standard allows computers to exchange objects, character sets, and multimedia using e-mail without regard to the computer’s underlying operating system. MIME is defined in the IETF RFC1521 standard.
Namespace A method of organizing methods and other programming library resources into easily accessible groups. Each group performs a given task or set of tasks on a particular object type. For example, a namespace might contain several methods associated with the file system on a computer. Although namespaces normally represent a means of organizing methods, doing so is not required. Some developers use namespaces as a safe method for placing methods in a container without regard to use.
Namespace Identifier (NID) A special number that identifies a particular company’s resources. The NID is used with URNs to ensure uniqueness.
Network Interface Card (NIC) The device responsible for allowing a workstation to communicate with the file server and other workstations. It provides the physical means for creating the connection. The card plugs into an expansion slot in the computer. A cable that attaches to the back of the card completes the communication path.
NIC
See Network Interface Card.
NID
See Namespace Identifier.
Object Request Broker (ORB) The component in CORBA that acts as middleware between a client and server. The client makes a request of the ORB without knowing anything about the server used to answer the request. Likewise, the server responds to the request without knowing anything about the client.
OMG
See Object Management Group.
Object Management Group (OMG) A consortium of more than 700 companies. The goal of this consortium is to define better object-oriented programming methodologies. OMG created the CORBA specification.
ORB
See Object Request Broker.
OSF
Open Software Foundation.
Parse To reduce a long label to its component parts. Spreadsheets normally break words and numbers apart using the spaces between them as the break point. You can change how spreadsheets parse a label by changing the format line. PDA
See Personal Digital Assistant.
Peer-to-Peer Network A group of connected computers in which every computer can act as a server and a client. Selected computers normally provide services to others, but unlike a client/server network, the network administrator can distribute the processing load over several machines. In addition, all nodes of a peer-to-peer network also act as workstations. PEM1
See Privacy Enhanced Mail Part I.
Personal Digital Assistant (PDA) A very small PC normally used for personal tasks such as taking notes and maintaining an itinerary during business trips. PDAs normally rely on special operating systems and lack standard application support. POST
See Power-On Self Test.
Power-On Self Test (POST) The set of diagnostic and configuration routines that the BIOS runs during system initialization. For example, the memory counter you see during the boot sequence is part of this process.
Privacy Enhanced Mail Part I (PEM1) A specification that defines methods for encrypting mail in a way that protects the user’s identity but allows decrypting in the background. This includes the use of keys and other forms of certificate management. Some of the specification is based on the CCITT X.400 standard.
Protocol A set of rules used to define a specific behavior. For example, protocols define how networks transfer data. Think of a protocol as an ambassador who negotiates activities between two countries. Without the ambassador, communication is difficult, if not impossible.
Proxy When used in the COM sense of the word, a proxy is the data structure that takes the place of the application within the server’s address space. Any server responses to application requests are passed to the proxy, marshaled by COM, and then passed to the application.
RDDL
See Resource Directory Description Language.
Remote Access The ability to use a remote resource as you would a local resource. In some cases, this also means downloading the remote resource to use as a local resource.
Remote Method Invocation (RMI) A relatively simple wire protocol designed to support Java. RMI is a binary protocol like DCOM and CORBA and is built upon a CORBA base. This protocol won’t support platforms other than Java.
Remote Procedure Call (RPC) One of several methods for accessing data within another application. RPC is designed to look for the application first on the local workstation, and then across the network at the applications stored on other workstations. This is an advanced capability that will eventually pave the way for decentralized applications.
Resource Directory Description Language (RDDL) This specification shows how an organization could use a URL that it owns to point to an XHTML document that contains a list of resources the company wants to make accessible. The document contains a description of the resource in human-readable form and embeds the required machine information as part of the description.
Rivest Shamir Adleman algorithm (RSA) An authentication technology that relies on a private-public key pair to create a set of credentials. The credentials are then used as a means of identification for logging into various network resources. Using this methodology allows for secure data transmission as well as user-oriented features, such as one password login to the network.
RMI
See Remote Method Invocation.
RPC
See Remote Procedure Call.
RSA
See Rivest Shamir Adleman algorithm.
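The private-public key pair that the Rivest Shamir Adleman algorithm entry mentions can be illustrated with a toy sketch. The primes, the exponent, and the `sign` and `verify` helper names are assumptions chosen for illustration; real credentials use keys hundreds of digits long along with padding and hashing, none of which appears here.

```python
# Toy RSA key pair built from two tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                   # public modulus shared by both keys
phi = (p - 1) * (q - 1)     # used only while generating the keys
e = 17                      # public exponent, chosen coprime with phi
d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)


def sign(message, private_exponent=d, modulus=n):
    """Create a credential that only the private-key holder can produce."""
    return pow(message, private_exponent, modulus)


def verify(message, signature, public_exponent=e, modulus=n):
    """Check a credential using nothing but the public key."""
    return pow(signature, public_exponent, modulus) == message


credential = 65                            # hypothetical value standing in for a login token
signature = sign(credential)
print(verify(credential, signature))       # the genuine credential checks out
print(verify(credential + 1, signature))   # a tampered credential does not
```

Anyone holding the public values `e` and `n` can check the credential, but only the holder of `d` can create it, which is the property a network login relies on.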
S/MIME
See Secure/Multipurpose Internet Mail Extensions.
S2ML
See Security Service Markup Language.
SAX
See Simple API for XML.
Scalability A definition of an object’s ability to sustain increases in load. For example, companies often rate networking systems by their ability to scale from one to many users. Software scalability determines the ability of the software to run on more than one machine when needed without making it appear that more than one machine is in use.
Scalable Vector Graphics (SVG) A vector-based method of describing a graphic using XML. Vector graphics are infinitely scalable because they rely on math definitions, rather than bitmaps. They also require less storage space than bitmap graphics. Unfortunately, SVG requires more display time, processing power, and resources. It also requires a special XML parser and display application.
SCM
See Service Control Manager.
Secure Hypertext Transfer Protocol (SHTTP) A technology designed to encrypt messages sent using the Internet. This technology is similar in purpose to Secure Socket Layer (SSL). However, SSL secures the connection between two computers, while SHTTP secures the individual messages. It’s possible to use both technologies together to provide enhanced security.
Secure Socket Layer (SSL) A digital signature technology used for exchanging information between a client and a server. Essentially, an SSL-compliant server will request a digital certificate from the client machine. The client can likewise request a digital certificate from the server. Companies or individuals obtain these digital certificates from third-party vendors such as VeriSign, who can vouch for the identity of both parties.
Secure/Multipurpose Internet Mail Extensions (S/MIME) A secure method to transfer attachments and other message elements on the Internet. S/MIME supports RSA’s public key encryption technology. See Multipurpose Internet Mail Extensions for additional details.
Security Service Markup Language (S2ML) Provides a secure method for companies to exchange information about transactions and customers. This protocol is commonly used in business-to-business and business-to-customer environments. This technology relies on XML as a basis for communicating data in a transparent manner.
Service Control Manager (SCM) The SCM is part of the load balancing technology used by Windows servers. When a client makes a DCOM call to the load-balancing router, it’s the SCM that actually receives the request. The SCM looks up the component in the load-balancing router table, then makes a DCOM call to one of the servers in the application cluster to fulfill the request. The server in the application cluster creates an instance of the request object, then passes the proxy for it directly to the client. At this point, the server and the client are in direct communication; the router is no longer needed.
SGML
See Standard Generalized Markup Language.
SHTTP
See Secure Hypertext Transfer Protocol.
Simple API for XML (SAX) A less complex alternative to the Document Object Model (DOM). This API is more efficient than DOM and will allow you to create faster applications. Unfortunately, SAX is read-only, so it limits the number of tasks you can perform. SAX is also a developer-only specification; none of the standards groups such as W3C supports it.
Simple Object Access Protocol (SOAP) A Microsoft-sponsored protocol that provides the means for exchanging data between COM and foreign component technologies like Common Object Request Broker Architecture (CORBA) using XML as an intermediary.
Simple Object Access Protocol - Routing Protocol (SOAP-RP) A stateless protocol used to exchange one-way SOAP messages between a client and server. The protocol supports intermediary destinations. You can also include a return path that enables two-way communication between client and server. SOAP-RP requires a transport protocol such as TCP, UDP, or HTTP.
Single Threaded Apartment (STA) A method of defining how object methods get executed. STAs include three restrictions not found in multi-threaded apartments (MTAs). The first is that an STA contains one, and only one, object. This ensures that once a component is instantiated, the resulting object doesn’t share memory space with any other object, which could result in corruption. The second restriction is that one, and only one, thread can enter the apartment to interact with the object inside. The reason for this restriction is obvious. A single threaded object can only handle the requests of one thread at a time, which means that COM must protect the object from access by more than one thread. Ensuring that only one thread can enter the apartment at a time is the easiest way to accomplish this task. Finally, a thread can execute only one object method at a time. This restriction ensures that there won’t be any data corruption due to shared variables within the object. As a result of these restrictions, a single process could contain multiple STAs; one for each STA object that the application instantiated.
Smart Card A type of user identification used in place of passwords. The use of a smart card makes it much harder for a third party to break into a computer system using stolen identification. However, a lost or stolen smart card still provides user access. The most secure method of user identification is biometrics.
SMO
See Microsoft SOAP Messaging Object.
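The Simple Object Access Protocol entry describes XML as the intermediary between component technologies. The following is a minimal sketch of that idea, assuming a SOAP 1.1 envelope; the `AddNumbers` method, its parameters, and the `urn:example:demo` namespace are hypothetical names used only for illustration, and a real toolkit would add encoding attributes and send the text over HTTP.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
METHOD_NS = "urn:example:demo"  # hypothetical namespace for the called component


def build_request(method, params):
    """Wrap a method call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{METHOD_NS}}}{method}")
    for name, value in params.items():
        # Every value travels as text; the receiver converts it back.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Recover the method name and parameters from the envelope text."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    method = call.tag.split("}", 1)[1]              # strip the namespace prefix
    return method, {child.tag: child.text for child in call}


request = build_request("AddNumbers", {"a": 2, "b": 3})
print(request)
print(read_request(request))
```

Because the request is plain text, neither side needs to know which component technology the other side uses; each one only has to parse the XML.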
SOAP
See Simple Object Access Protocol.
SOAP-RP
See Simple Object Access Protocol - Routing Protocol.
SOM
See System Object Model.
SSL
See Secure Socket Layer.
STA
See Single Threaded Apartment.
Standard Generalized Markup Language (SGML) A specification for defining document format originally created for the publishing industry. Most developers consider SGML too complex for standard display purposes. However, both XML and HTML are based on SGML. SVG
See Scalable Vector Graphics.
System Object Model (SOM) An alternative object standard from IBM used with OS/2. The Workplace Shell uses SOM in place of OLE to create objects. TCP/IP
See Transmission Control Protocol/Internet Protocol.
Transmission Control Protocol/Internet Protocol (TCP/IP) A standard communication line protocol developed by the United States Department of Defense. The protocol defines how two devices talk to each other. Think of the protocol as a type of language used by the two devices.
UDDI
See Universal Description, Discovery, and Integration.
UDP
See User Datagram Protocol.
Unicode Transformation Format (UTF) A standardized method of representing characters both printed and abstract using codes. Other forms of character representation include ASCII. Uniform Resource Identifier (URI) A generic term for all names and addresses that reference objects on the Internet. A URL is a specific type of URI. See Uniform Resource Locator (URL). Uniform Resource Locator (URL) A text representation of a specific location on the Internet. URLs normally include the protocol (http:// for example), the target location (world wide web or www), the domain or server name (mycompany), and a domain type (com for commercial). It can also include a hierarchical location within that Web site. The URL usually specifies a particular file on the Web server, although there are some situations where a Web server will use a default filename. For example, asking the browser to find http://www.mycompany.com, would probably display the DEFAULT.HTM file at that location. Uniform Resource Name (URN) A managed resource identifier. Being a managed resource means that a URN is guaranteed to be unique. Companies that want to use a URN must apply for a Namespace Identifier (NID) from an authority such as the Internet Assigned Numbers Authority (IANA). The NID appears as part of every resource reference that the company creates.
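The URL and URN entries above both describe identifiers that break into well-defined parts. The following sketch, which uses only the Python standard library, shows one way to pull those parts out. The `describe_url` and `split_urn` helper names are assumptions made for illustration, and the sample URN simply reuses this book's ISBN with the registered `isbn` namespace identifier.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the protocol, host, and hierarchical location."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": parts.path or "/",  # an empty path means the server's default file
    }


def split_urn(urn):
    """Return the Namespace Identifier (NID) and the name it manages."""
    scheme, nid, name = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return nid, name


print(describe_url("http://www.mycompany.com"))
print(split_urn("urn:isbn:0789725665"))
```

The URL tells you where to look, while the URN only tells you what the resource is called; the NID is what keeps that name unique.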
Universal Description, Discovery, and Integration (UDDI) A method of advertising application and other software-related services online. The vendor offering the service registers at one or more centralized locations. Clients wanting to use the service add pointers to the service to their application. URI
See Uniform Resource Identifier.
URL
See Uniform Resource Locator.
URN
See Uniform Resource Name.
User Datagram Protocol (UDP) Allows applications to exchange individual packets of information over a TCP/IP network. UDP uses a combination of protocol ports and IP addresses to get a message from one point of the network to another. More than one client can use the same protocol port as long as all clients using the port have a unique IP address. Protocol ports are of two types: well known and dynamically bound. The well-known port assignments use the ports numbered between 0 and 1023.
User Defined Type (UDT) A special construct supported by some programming languages that allows developers to create complex data types that closely mirror real-world environments. Most developers include UDTs when a language-supplied simple or complex type won’t fulfill a specific purpose.
UTF
See Unicode Transformation Format.
W3C
World Wide Web Consortium.
WAN
See Wide Area Network.
Web Services Description Language (WSDL) A method for describing a service. The file associated with this description contains the service description, port type, interface description, individual method names, and parameter types. A WSDL relies on namespace support to provide descriptions of common elements, such as data types. Most WSDL files include references to two or more resources maintained by standards organizations to ensure compatibility across implementations.
Wide Area Network (WAN) An extension of the local area network (LAN), a WAN connects two or more LANs together using a variety of methods. A WAN usually encompasses more than one physical site, such as a building. Most WANs rely on microwave communications, fiber-optic connections, or leased telephone lines to provide the internetwork connections required to keep all nodes in the network talking with each other.
Wire Protocol A set of rules that govern the method for transferring data from one point to another across a network. Wire protocols can transfer data in either binary or text format. In addition, wire protocols can include additional features such as encryption and transactions.
WSDL
See Web Services Description Language.
XDR
See XML Data Reduced.
XHTML
See eXtensible Hypertext Markup Language.
XLANG An automated process language developed by Microsoft. It is used to describe business processes within products such as BizTalk server, but could be used for other purposes. XML
See eXtensible Markup Language.
XML Data Reduced (XDR) A subset of the standard eXtensible Markup Language (XML) schema. XDR is less complex than XML and therefore easier to implement. It concentrates on schema elements that developers will most likely use. Because XDR uses a reduced schema, it doesn’t provide the flexibility found in XML.
XML Schema Definition (XSD) The portion of the XML specification that defines data types and other data elements. It’s also related to a Web site containing such information by use of XML parsers.
XMLP
See eXtensible Markup Language Protocol.
XML-RPC
See eXtensible Markup Language-Remote Procedure Call.
XRML
See eXtensible Rights Markup Language.
XSD
See XML Schema Definition.
INDEX
Numbers 4S4C (Simons Soap Server Services for Com), 54 toolkit, creating clients, 400-404 features, 391-392 installing, 393-394 overview, 391 server-side components, creating, 395-400 Web site, 390 WSDL files, creating, 395-396
A A Little Interface Definition Language (ALIDL), 136 access, 162 See also remote access Acronym Finder Web site, 127 acronyms, identifying (online resources), 127 ActiveX, Web-based applications, 306-307 add-ons, PDAs, 324 AddIt client application form settings, 106-107 ALIDL (A Little Interface Definition Language), 136 analysis component, surveys, 216-218 Apache, support for international applications, 130 Apache Toolkit, 83 API (Application Programming Interface) Low-level, 86-87 functions GetComputerName( ), 293 GetComputerNameEx( ), 292-293 OpenSCManager( ), 176
applications client code, 106-109 clients, creating with 4S4C toolkit, 400-404 COM language binding, 136-137 complexity differences, 98 components, registering, 399 creating shortcuts, 100-102 data flow, 99-100 data type, testing, 254 data viewers creating client, 151-153 separating data viewing logic from main component, 148-151 server-side component, 147-148 updating, 145 databases, migrating, 153-155 debugging, 110-111 design basics, 99-100 differences and, 98-100 distributed compared to client/server, 127 efficiency considerations, 124 error handling, 404-406 input, testing surveys, 214-216 integrating modules, 134-136 international, support for, 130 migrating changing modules, 126-127 error handling, 131-134 locating protocol problems, 130-131 performance concerns, 158-159 prioritizing development, 128-129 problems with, 155 protocols, 129-130 reliability concerns, 156-157 security concerns, 157 troubleshooting, 159-160 MIME and, 234-236 modules, testing, 134
naming components, URIs and, 126 output, surveys, 218-221 partner access, 45 PDAs Complex Type Example, testing, 331-332 Computer Name example, testing, 338 performance, 117-120 remote access utilities, 163-165 satellite, 46 server-side code, 104-106 server-side components, 104-105 creating with 4S4C toolkit, 395-400 testing, 109-116 databases, remote access, 272-274 migrating to SOAP, 125 updating for SOAP use, 243 utilities creating local components, 142-144 migrating from DCOM, 140 updating client, 144-145 updating server-side components, 141-142 Web-based. See Web-based applications WSDL files, creating with 4S4C toolkit, 395-396 arguments, passing configuration strings as, 167 arrays, dynamic, databases and, 262 ASP files ISAPI Listener and, 179 scripts, paths, 37 Web-based applications, 316-317 ATL COM AppWizard projects, creating, 396-398
attachments, 57-58 MIME, 172 attributes elements and, 118-119 importance of, 398-399 AuthXML, 68
B bandwidth, 26 bar-code scanners, PDAs, 323 Base64 coding, XML transmissions and, 205 base64 data type, 347 binary compatibility, 25 binary data type, 347 binary protocols, 25 BIND (Berkeley Internet Name Domain), 65 binding, COM language binding, 136-137 biometrics, 75-77 BizTalk (Microsoft), 356-359 Editor utility, 359-362 Mapper utility, 363-365 Orchestration Designer utility, 365-368 reasons to use, 369-370 SOAP problems fixed, 368-369 utilities, 359 BLOB (binary large object) support, 257 block diagrams, planning migration, 124 boolean data type, 347
CDATA (character data), 204 character substitution, data transfer, 177 ClearLogEntries( ) method, 188 client code, applications, 106-109 client component, server status viewer, 181-182
ComplexType 1.0 Type Library, 251
client-side applications, databases, remote access, 267-272
ComplexType Component Source Code, 252
client-side components, 51 Computer Name example, PDAs, 333-335 creating data viewer, 151-153 PDAs, Complex Type Example, 329-331
CompName component source code, 293-294
client/server applications compared to distributed, 127 limitations of, 146-147 clients applications and, 99 thick clients, Web-based applications, 297-298 thin clients, 289-301 form views, 298-299 thick clients and, 291 Web page design, 299-301 updating, 144-145 CLR (Common Language Runtime), 102 CLSID (class ID), 18 cmdCompleted_Click( ) method, 271 cmdMakeEntry_Click( ) method, 190 cmdRemote_Click( ) source code, 181-182
Bray, Tim (XML), 198
cmdTask_Click( ) method, 271
browsers, compatibility, Web-based applications and, 287
code optimization, 119-120 toss-away, 172
byte data type, 347
C case studies reasons to use SOAP, 29-31 security, 78-79 SOAP solutions, 58-60 Web-based applications, 317-319
Complex Type Example, PDAs and, 329-332
client message type, 42
Box, Don, 346
bulletin boards, Web-based applications and, 286
complex data types, 245-254 server-side component, 251-252 SOAP Toolkit and, 87 UDTs, 248 WSDL generators, 247-248
color, PDAs, 340 COM (Component Object Model) GUIDs, 126 language binding, 136-137 COM+ components, remote access and, 167 company locators, Web-based applications and, 286 company policies, Web-based applications and, 286
component interactions, Web-based applications, 313-314 Component Object Model. See COM components access GUID, 53 calling, 51 client-side, 51 server-side, 51 Computer Name example, PDAs and, 332-338 configuration remote access and, 164 strings, passing as argument, 167 connection loss, Web-based applications, 304 connectivity, databases, 240 controls data entry forms, 203 surveys, 203 converting See also migration data types, 346, 353-354 converting data types, 245-247 CORBA (Common Object Request Broker Architecture), 240 IIOP, 19-20 interoperability and, 171 SOAP and, 15, 19-20 crackers, 45, 62, 64 CreateObject( ) function, 301 cross-platform support, SOAP Toolkit, 88 CSI (Computer Security Institute), security issues, 307 CSS (cascading stylesheets), data entry forms, 233
D DASS (Distributed Authentication Security Service), 68 data compression, databases and, 239 data encryption, Masker 2.0, 374-376 data entry controls, 203 formatting, XML and, 198 round trips, 199-200 data entry forms, 196-197 application creation project, 221, 223-227 CDATA, 204 client application, 225-227 empty value processing, 202-203 errors, 232-233 NULL value processing, 202-203 performance, 230-231 privacy, 229-230 project, 236 reliability, 231-232 security, 228-229 shortcuts, 199-208 templates, 233 third-party products, 206-208 vehicle choice, 197-199 WSDL files, 200-202 data flow, 99-100 data manipulation, remote access and, 163 data security issues, 66-67 data sharing, 12 data transfer, 35, 52-57 formatting loss, 177-178 HTTP and, 39-40 requirements, messages and, 39 data transfers, DCOM, 16-18 data types, 346 application, testing, 254 complex, 245-254, 349-351 conversion, 245-247 conversions, 346, 353-354 databases, validation, 244 implementation, 351-352 inline data typing, 352 method calls, migrating to SOAP, 124 overview, 346-349
translation, 240 XML, dateTime, 348 databases arrays, dynamic arrays, 262 component simplicity, 256-257 connectivity, 240 data entry form application, 223 data type translations, 240 data types complex, 245-254 validation, 244 data viewers creating client, 151-153 separating data viewing logic from main component, 148-151 server-side component, 147-148 updating, 145 forms and, 242 limitations of SOAP, 242-243 migrating, 153-155 multi-tier component, 263-266 order entry, 243 parsing techniques, 244 queries, 243 remote access, 238-239 client-side application, 267-272 interface, 249-251 remote client creation, 252-253 server-side component, 251-252 shortcuts, 243-245 testing application, 272-274 transactions, 278-279 troubleshooting application, 274-277 uses/concerns, 239-243 reports, 243 round trips, 199-200 scalability, 240 security, remote access, 240 server-side component, 255-262 code generation, 259-263 multiple, 258-259 SQLXML, 257-258 SQL Server, defining, 255 SQLXML, 257-258 state and, 154 survey forms, 210 surveys, 242 three-tier programming model, 146-147 toolkits, capacity, 244 transaction support, 240 transactions, remote access, 278-279 troubleshooting, 279-281
date data type, 347 DCE (Distributed Computing Environment), 15 DCOM (Distributed Component Object Model), 126, 240 data transfer, 16-18 interoperability and, 171 migrating database applications, 153-155 from SOAP, 123 server-side components, 141-142 SOAP and, 15-18 SOAP replacing, 46 DCOM Wire Protocol, 15-18 debuggers, scripts, Web-based applications, 314-316 debugging applications, 110-111 decimal data type, 347 declarations, complex data types, 349 design basics, applications, 99-100 design tools, schemas as, 183 detail, fault messages, 43 developerWorks Web site, 123 development application modules, integrating, 134-136 IDE, enhancing efficiency of, 138 research tools, 137-138 DevelopMentor, 14 DISCO (Discovery of Web Services), 48 discovery protocols, 47 display, PDAs, 338-341 distributed application use, 14 distributed applications client/server model and, 146-147 compared to client/server, 127 security, 66 Distributed Object Component Model. See DCOM DLLs (dynamic link libraries) namespaces and, 102-104 SOAP Toolkit and, 84-85 DNS (Domain Name Service), 65 document style WSDL files, 200, 202
DOM (document object model), 205 compression and, 239 XML parsing and, 205 DOS (denial of service) attacks, 259 double data type, 347 downloading graphics, 163 SOAP toolkit, PDAs, 326-329 DSI (Digital Signatures Initiative), 68 DSOM (Distributed SOM), 19 dynamic arrays, databases and, 262
E eBay, 123 ebXML (Electronic Business eXtensible Markup Language), 47 EDI (Electronic Data Interchange), 357 elements, attributes and, 118-119 employee check-in application, remote access, 182-192 empty values data entry forms, 202-203 surveys, 202-203 encryption, 67 endBody( ) method, 103
float data type, 348 flowcharts, BizTalk Orchestration Designer, 366 foreign language code, encoding requirements, 130 formats native, 163 StarOffice, 198 formatting data entry vehicles and, 197 data transfer, loss, 177-178 data vehicles, XML and, 198 forms See also surveys data entry, 196-197 vehicle choice, 197-199 databases and, 242 surveys, 208-221 Web-based applications, 286 functions API, OpenSCManager( ), 176 CreateObject( ), 301 EnumServicesStatus( ), 176 EnumServiceStatus( ), 176 GetAllNames( ), 295-296 GetComplexString, 251 GetCompName( ), 295 GetComputerNameEx( ), 292-293 ParseData( ), 268 Replace( ), 182
G GetAllNames( ) function, 295-296
envelopes, messages, 22
GetComplexString method, 251
error handling, 116-117, 131-134 Visual C++, 404-406 Web-based applications, 303-307
GetCompName( ) function, 295
error numbers, 133
GetComputerNameEx( ) API function, 292-293
F fault messages, 41-44 categories, 42 faultcode, fault messages, 43 faultactor, fault messages, 43 faultstring, fault messages, 43 file formats, native, 163
GUIDGen, 250 GUIDs, GUIDGen and, 250
form views, thin clients, 298-299
EnumServicesStatus( ) function, 176
errors data entry forms, 232-233 surveys, 232-233
GUID (globally unique identifier), 126, 250 components and, 53
GetComputerName( ) API function, 293
GetObject( ) method, 187 GetStatus( ) method, source code, 173-176
H hackers, 62 headers, 39 processing, 170-171 HTTP (Hypertext Transfer Protocol), 12, 22-23 data transfer and, 39-40 databases, 240 message portion, 37-38 HTTP Authentication Framework, 71-73 HTTP Authentication Framework RFC2617, 69 HTTP Extension Frameworks, 44-45 HTTP timeouts, PDAs, 332 HTTP wrappers messages, 35 response header portion, 38 human language support, Web-based applications, 316
I IBM Web Services Toolkit, 311-312 IDE (integrated development environment), enhancing efficiency of, 138 IDL files, method attributes, importance of, 398-399 IdooXoap Web site, 391 IdooXoap, PDAs, 328 IETF (Internet Engineering Task Force), 68
GetTasks( ) method, 262
IIOP (Internet Inter-ORB Protocol), 19-20 SOAP replacing, 46
GIF (Graphics Interchange Format) images, 163
IIS (Internet Information Server), paths, spaces in, 180
graphics, downloading, 163
images, GIFs, 163
GSS-API (Generic Security Service Application Program Interface), 69
implementation data types, 351-352 problems, 53-55
inline data typing, 352 input applications, surveys, testing, 214-216 installation 4S4C toolkit, 393-394 listeners, 51 int data type, 348 integer data type, 348 integrated development environment. See IDE integration, application modules, 134-136 interfaces, databases, remote access, 249-251 international applications, support for, 130 Internet, security standards, 68-70 Internet-only applications, 45 interoperability, 171 online information about, 134 IPSec (Internet Protocol Security Protocol), 69 ISAPI Listener, 179-181
J Java RMI (Remote Method Invocation), 163 SOAP and, 20-22 JScript, PDAs, 330 JVMs (Java Virtual Machines), 21
K
check-in one component source code, 184-186 cmdRemote_Click( ) source code, 181-182 Complex Type Client Code for PDA, 329-335 CompNameProc source code, 295-296 Data exchange routines for client application, 269-271 Data Survey Input client Source Code, 213-214 Data Survey Input Component Source Code, 211-212 Data-Viewer Client Code, 151-153 GetStatus( ) method source code, 174-176 IDL File Defining Visual Basic Component Interface, 249-250 Movement routines for client application, 272 OnTest Event Handler Code, 402 Remote Client Source Code, 252-253 remote test client source code, 190-191 Server Information Component, 397-398 Server-Side Component for SOAP Use, 148-151 Server-Side Database Component Accessible from DCOM or SOAP, 260-262 Thick Client source code, 297 Utility routines for client application, 267-268 Web Page Thin Client source code, 300-301
Kerberos Network Authentication Service, 70
live data Web-based application, 302-303
kSOAP, PDAs, 328
local components, migrating, 142-144
L LANs (Local Area Networks), 12 libraries ComplexType 1.0 Type Library, 251 namespaces and, 102-104 listeners, 86 designing, 105-106 installation, 51 listings Analysis Component for Food Survey Source Code, 217-218
long data type, 348 Low-level API, 86-87
M M-POST, 37 maintenance, remote access and, 164 Masker 2.0 data encryption tool, 374-376 memory, Web-based applications, 312-317
Message Validation Web site, 40-41 messages, 35-37 data transfer requirements, 39 decoding, 54 envelopes, 22 fault messages, 41-44 categories, 42 header, 39 headers, POST action, 37 HTTP portion, 37-38 HTTP wrapper, response header portion, 38 HTTP wrappers, 35 response header, error status, 38 XML and, 35 XML portion, 38-39 XML wrappers, 35 method calls, data types, migrating to SOAP, 124 methods attributes, importance of, 398-399 ClearLogEntries( ), 188 cmdCompleted_Click( ), 271 cmdMakeEntry_Click( ), 190 cmdTask_Click( ), 271 endBody( ), 103 GetComplexString, 251 GetObject( ), 187 GetStatus( ), source code, 173-176 GetTasks( ), 262 mssoapinit( ), 108 SetComplete( ), 262 startBody( ), 103 Microsoft BizTalk, 356-359 Microsoft Installer Web site, 393 Microsoft Queued Components, 35 Microsoft SOAP Toolkit. See SOAP Toolkit migrating applications performance concerns, 158-159 problems with, 155 reliability concerns, 156-157 security concerns, 157 data types, 124 data viewer applications, 145 creating client, 151-153 separating data viewing logic from main component, 148-151 server-side component, 147-148 databases, 153-155
migration application modules changing, 126-127 integrating, 134-136 prioritizing development, 128-129 block diagrams, 124 COM language binding, 136-137 DCOM to SOAP, 123 development research tools, 137-138 distributed application concepts, 127-128 eBay and, 123 planning, 123-126 protocols, 129-130 error handling, 131-134 locating problems, 130-131 testing, 125 troubleshooting, 142, 159-160 utility applications, 140 creating local components, 142-144 updating client, 144-145 updating server-side components, 141-142 VB projects, Project properties dialog box, 141 WSDL files, troubleshooting applications, 138-140 MIME (Multipurpose Internet Mail Extensions), 57-58, 163 applications and, 234-236 attachments, 172
N Namespace Identifier (NID), 126 namespaces, 22, 102-104 WSDL, 49 XML-RPC, 56 naming, URIs and, 126 native formats, 163 networks, PDAs, 325 NID (Namespace Identifier), 126 NULL values data entry forms, 202-203 surveys, 202-203
O OMG (Object Management Group), CORBA and, 19 ONE (Open Network Environment), 19 OpenOffice.org, 198 OpenSCManager( ) API function, 176 operating systems, PDAs, 325-326 order entry, databases, 243 OSF (Open Software Foundation), 16 output applications, surveys, 218-221
modems, PDAs, 323, 325 monikers, 18
P
client-side component, Computer Name example, 333-335 color, 340 Complex Type Example, 329-332 client-side component, 329-331 Computer Name example, 332-338 development need, 323-324 display issues, 338-341 HTTP timeouts, 332 IdooXoap, 328 implementation differences, Complex Type Example, 331 JScript, 330 kSOAP, 328 modems, 323, 325 networks, 325 operating systems, 325-326 pocketSOAP, 327 pointers, 340-341 screen size issues, 339-340 security issues, 341-343 server-related differences, 337-338 SOAP toolkit, downloading, 326-329 SOAP::Lite Site, 326 special needs for, 323-326 synchronization, 325 Trace Utility (MSSoapT), 335-336 troubleshooting, 343-344 Windows CE, 325-326 PDAs (Personal Digital Assistants), 322-323 SOAP Toolkit and, 88
monitoring remote access and, 163 security monitoring, 65
ParseData( ) function, 268
monitoring programs, 46
partner access, applications, 45
MSI (Microsoft Installer) Web site, 393
passing configuration strings as arguments, 167
mssoapinit( ) method, 108
paths, spaces (IIS), 180
performance applications, 117-120 issues, 26-27 surveys/data entry and, 230-231
MSSOAPR.DLL file, 85
payload protection, security, 310-311
Personal Digital Assistants. See PDAs
PCT (Private Communication Technology), 69
phone book applications, remote access, 164-165
PDAs add-ons, 324 applications Complex Type Example, testing, 331-332 Computer Name example, testing, 338 bar-code scanners, 323
platforms, application differences, 98
MSSoapT (Trace Utility), PDAs, 335-336 multi-tier component, databases, 263-266 Must Understand message type, 42 MZTools (VB), 376-380 Web site, 138
parsing techniques, databases, 244
PEM1 (Privacy Enhanced Mail Part 1), 69
pocketSOAP, PDAs, 327 pointers, PDAs, 340-341 POST action portion, message header, 37
privacy data entry forms, 229-230 issues, 65-66 surveys, 229-230
RegSvr32, registering application components, 399
RPC (Remote Procedure Call), 35, 86-87
reliability, surveys/data entry forms, 231-232
RPC style WSDL files, 200-202
problem solving tools, 373-374
remote access COM+ components and, 167 configuration and, 164 data manipulation, 163 databases, 238-239 client-side application, 267-272 interface, 249-251 multi-tier component, 263-266 remote client creation, 252-253 server-side component, 255-263 shortcuts, 243-245 testing application, 272-274 transactions, 278-279 troubleshooting application, 274-277 uses/concerns, 239-243 maintenance, 164 phone book application, 164-165 services and, 164 streams and, 167 task scheduling and, 164
problems solved by using SOAP, 24-25 processing components, Web-based applications, thin clients, 295-297 productivity tools, 372-373 programming code, foreign language encoding requirements, 130 projects, 120 data entry forms, 236 surveys, 236 protocols binary protocols, 25 discovery, 47 early computers and, 12 migrating applications, 129-130 error handling, 131-134 locating problems, 130-131 transport, 35 proxies thin clients, 290-291 VBWS Proxy Generator, 290-291 proxy servers, 25 psWSDL Wizard, 200, 247-248, 381-383
Q QOP (quality of protection), 72 Que Web site, 141
remote access utilities, 162, 193-194 applications, 163-165 components, existing, 168 employee check-in application, 182-192 flexibility, 167 monitoring and, 163 non-issues, 170-172 overview, 162-172 security issues, 169-170 server status viewer, 172-182 Web Services, 165-166 WSDL files and, 188-190
queries, databases, 243
remote client, database remote access, 252-253
Queued Components (Microsoft), 35
Replace( ) function, 182
R RDDL (Resource Directory Description Language), 126 registration application components, RegSvr32, 399 scripts, local components, 144 registry entries avoiding ambiguity, 144 initialization process and, 167
reports, databases, 243 research tools, 137-138 Resource Directory Description Language. See RDDL resources, Web-based applications, 312-317 RMI (Remote Method Invocation), 15 See also Java RMI ROAP.DLL, 84 round trips, data entry and surveys, 199-200
S S/MIME (Secure Multipurpose Internet Mail Extensions), 69, 73-74 security issues, 309-310 S/WAN (Secure/Wide Area Network), 70 S2ML (Security Services Markup Language), 65, 70 satellite applications, 46 scalability, databases, 240 schemas, as design tools, 183 SCL (SOAP Control Language), 48 BizTalk and, 358 SCM (service control manager), 176 screen size, PDA issues, 339-340 scripting errors, Web-based applications, 304-305 scripts, debugging Web-based applications, 314-316 SDL (Service Description Language), 84 security, 46, 62-63 application development, integrating modules, 135 biometrics, 75-77 case study, 78-79 components, Web-based applications, 305 crackers, 45 data entry forms, 228-229 data security issues, 66-67 databases, remote access, 240 distributed applications, 66 DOS attacks, 259 encryption, 67 error handling and, 116 HTTP Authentication Framework, 71-73 Internet standards, 68-70 issues, 64-65 migration considerations, 157 monitoring, 65 payload protection, 310-311 PDA issues, 341-343 remote access utilities, 169-170 S/MIME, 73-74, 309-310
smart cards, 75-76 SSL, 74-75 standards, 67-75 surveys, 228-229 two-way communications and, 310 user identification issues, 75-77 user privacy issues, 65-66 vendors, 77-78 Web Services and, 165 Web-based applications, 307-312 WSML file, 180 serializer, application creation, 101 server message type, 42 server status viewer, 172-177, 179-182 client component, 181-182 server-side component, 173, 176-177 Windows API and, 173 server-side code, applications, 104-106 server-side components, 51 applications, 104-105 data entry form application project, 223-225 data viewers, 147-148 database remote access, 251-252, 255-262 code generation, 259-263 multiple, 258-259 migrating, 141-142 separating data viewing logic from main component, 148-151 server status viewer, 173, 176-177 survey form, 210-212 Web-based applications, thin clients, 292-295 servers application components, activating with 4S4C, 399 PDAs, differences, 337-338 proxy servers, 25 service information, polling, 173 status, working with information, 177-179 testing, 113-116 Web servers, 28 service information, polling servers for, 173 Service is in Use error, Web-based applications, 305
services, 47 DISCO, 48 remote access and, 164 UDDI, 49-50 WSDL, 48-49 Services MMC snap-in, 177 ServicesDeclarations.BAS file, 173 SetComplete( ) method, 262 short data type, 348 shortcuts application creation, 100-102 data entry forms, 199-200, 202-208 database remote access, 243-245 surveys, 199-200, 202-208 SHTTP (Secure Hypertext Transfer Protocol), 70 smart cards, 75-76 SMO (SOAP Messaging Object), 86-87 local components, creating, 143 migration from DCOM, 125 SOAP (Simple Object Access Protocol), 12 contributors to, 14 CORBA and, 15, 19-20 DCOM and, 15-18 HTTP and, 22-23 implementation resources Web site, 130 Java RMI and, 20-22 migrating from DCOM, 123 overkill, 166-167 overview, 12-15 problems solved by using, 24-25 specification problems, 134 state and, 128 theory of, 34-35 XML and, 13, 22-23 SOAP 1.1 Reference Web site, 137 SOAP Messaging Object Generator, 94-97 SOAP Messaging Object. See SMO SOAP Toolkit (Microsoft), 54-55 DLLs, 84-85 downloading for PDAs, 326-329 examples on two machines, 91 interoperability, 88 limitations, 98 schemas, 153 Low-level API, 86-87
Messaging Object Generator, 94-97 overview, 83-98 PDAs and, 88 problems with, 87-89 programming language support, 390 RPC (Remote Procedure Call), 86-87 SMO (SOAP Messaging Object), 86-87 variables, type information, 89 Visual C++ and, 87 WSDL Generator, 92-94 SOAPAction entry, 37 SOAP::Lite Site, 326 SOM (System Object Model), 19 source code cmdRemote_Click( ) method, 181-182 GetStatus( ) method, 173-176 specification, problems with, 134 speed, 27 SQL Server, defining databases, 255 SQLXML, 257-258 SSL (Secure Sockets Layer), 46, 70, 74-75 Web-based applications, 311 StarOffice, file formats, 198 startBody( ) method, 103 state, 128 database applications and, 154 status, servers viewer, 172-182 working with information, 177-179 streams, remote access and, 167 string data type, 348 strings, configuration, passing as argument, 167 surveys, 196-197 analysis component, 216-218 application processing, 221 CDATA, 204 controls, 203 databases, 210 databases and, 242 empty value processing, 202-203 errors, 232-233 form creation, 208-221
forms designing, 212-214 server-side component, 210-212 input application, testing, 214-216 NULL value processing, 202-203 output applications, designing, 218-221 output testing, 221 performance, 230-231 privacy, 229-230 project, 236 reliability, 231-232 round trips, 199-200 security, 228-229 shortcuts, 199-200, 202-208 WSDL files, 200-202
toolkits, 53-54 See also SOAP Toolkit (Microsoft) 4S4C creating clients, 400-404 creating server-side components, 395-400 creating WSDL files, 395-396 features, 391-392 installing, 393-394 overview, 391 Web site, 390 capacity, databases, 244 IdooXoap Web site, 391 Web sites, 390 White Mesa Web site, 391
task scheduling, remote access and, 164
tools Masker 2.0, 374-376 MZTools (VB), 376-380 problem solving, 373-374 productivity, 372-373 psWSDL Wizard, 381-383 tcpTrace, 383-384 third-party, 372-388 XML Spy, 384-388
tcpTrace, 383-384
toss-away code, 172
tcpTrace tool, 110-111
Trace Utility (MSSoapT), PDAs, 335-336
synchronization, PDAs, 325
T
telecommuters, 45 telephones, embedded SOAP applications and, 341 templates, data entry forms, 233 testing application modules, 134 applications, migrating to SOAP, 125 thick clients, Web-based application design, 297-298 thick clients, using with thin, 291 thin clients form view clients, 298-299 processing components, 295-297 proxies, 290-291 server-side components, 292-295 thick clients, using simultaneously, 291 Web page design, 299-301 Web-based applications, 289-301 third-party tools, 372-388 three-tier programming model, 146-147 time data type, 348
tradeoffs in performance, 26 transactions databases, 240 remote access, 278-279 support for, 131 transferring data, 52-57 translations, data types, 240 transport protocols, 35 troubleshooting applications, database remote access, 274-277 databases, 279-281 migrating applications, 159-160 migration, 142 migration problems, WSDL files, 138-140 PDAs, 343-344 tutorials, 123
U UDDI (Universal Description, Discovery, and Integration), 15, 49-50 BizTalk and, 357 registry, 164 specification, 136
UDTs (user defined types) complex data types, 248 SOAP Toolkit and, 87 uniform resource identifier (URI), 37, 70, 126 uniform resource name (URN), 126 unique identifiers, types of, 126 Universal Description, Discovery, and Integration specification. See UDDI updategrams, SQLXML, 258 upgrades, SOAP Toolkit, 91 URI (Uniform Resource Identifier), 37, 70, 126 URN (uniform resource name), 126 user privacy issues, 65-66 users, identification issues, 75-77 UTF (Unicode Transformation Format), 38 utilities BizTalk Editor, 359-362 BizTalk Mapper, 363-365 BizTalk Orchestration Designer, 365-368 migrating, 140 creating local components, 142-144 updating client, 144-145 updating server-side components, 141-142 remote access, 162, 193-194 applications, 163-165 existing components and, 168 flexibility, 167 non-issues, 170-172 overview, 162-172 security, 169-170 server status viewer, 172-182 Web Services, 165-166 WSDL files and, 188-190 remote access employee check-in application, 182-192 WSDL files, creating, 141 WSDLGen, 247
V validation, data types, databases and, 244 variables, type information, SOAP Toolkit and, 89
VB (Visual Basic) IDE, enhancing, 138 migrating projects, Project Properties dialog box, 141 MZTools, 376-380 SOAP Toolkit support, 390 VBWS Proxy Generator, 290-291 thin clients, 290-291 vendors, security, 77-78 VeriSign, 74 VersionMismatch message type, 42 viewers, server status, 172-182 client component, 181-182 server-side component, 173, 176-177 Visio, BizTalk and, 357 Visual Basic. See VB Visual C++ ATL COM AppWizard projects, creating, 396-398 error handling, 404-406 SOAP Toolkit and, 87 SOAP Toolkit support, 390 toolkit Web sites, 390
W WANs (Wide Area Networks), 12 Web pages design, thin clients, 299-301 dynamically created, 37 Web servers, 28 Web Services remote access utilities, 165-166 security, 165 Web Services Description Language. See WSDL Web sites 4S4C, 390 acronym identification, 127 foreign language programming encoding requirements, 130 IdooXoap, 391 inter, 134 Message Validation, 40-41 Microsoft Installer, 393 MZ-Tools, 138 Que, 141 Resource Directory Description Language, 126 SOAP 1.1 Reference, 137 SOAP implementation resources, 130 SOAP specification problems, 134
toolkits, 390 tutorials, 123 White Mesa, 391 WSDL Generator, 141
document style, 200-202 input application testing, 214 remote access utilities and, 188-190 RPC style, 200-202 troubleshooting applications, 138-140 utilities for creating, 141 limitations of, 136
Web-based applications, 284-285 ActiveX problems, 306-307 ASP, 316-317 browser compatibility, 287 case study, 317-319 challenges, 287-289 component communication, 316-317 component interactions, 313-314 components, security, 305 connection loss, 304 error handling, 303-307 human language support, 316 IBM Web Services Toolkit, 311-312 live data, 302-303 payload protection, 310-311 processing component, thin clients, 295-297 resource issues, 312-317 S/MIME, 309-310 script debuggers, 314-316 scripting errors, 304-305 security, 307-312 server-side component, thin clients, 292-295 Service is in use error, 305 SSL, 311 thick clients, 297-298 thin clients, 289-301 form views, 298-299 processing component, 295-297 server-side component, 292-295 uses for, 285-287 XHTML, 286
XDR (XML Data Reduced) files, 356
WebServices Center Web site, 127
XML Library, 85
White Mesa Web site, 391
XML Parser 3.0 files, SOAP Toolkit, 91
Windows API, server status viewer and, 173 Windows CE, 325-326 Windows Internet Connector Library, 86
WSDL Generator complex data types, 247-248 date variables, 263 SOAP Toolkit, 92-94 SOAP Toolkit and, 90 Web site, 141 WSDLGen utility, 247 WSML file, security, 180
X XHTML (eXtensible Hypertext Markup Language), 286 XML (eXtensible Markup Language), 13, 22-23 See also ebXML CDATA and, 204 document transmission restrictions, 204-205 formatting and, 198 messages, 38-39 messages and, 35 SOAP and, 13 Tim Bray, 198 wrappers, 35 XML Cover Pages, 50
XML Spy tool, 384-388 XML Technology Protocol Reference Web site, 127 XML-RPC, 52, 56
Winer, Dave, 346
xmlns tag, 22
wizards, psWSDL Wizard, 200, 247-248
XMLP (XML Protocol), 52, 56-57
WSDL (Web Services Description Language), 15, 48-49, 352 BizTalk and, 358 files creating with 4S4C toolkit, 395-396
XRML (eXtensible Rights Markup Language), 68 XSD (XML Schema Definition), 55 XSI (XML Schema for Interfaces), 55