php|Tropics Moon Palace Resort, Cancun, Mexico. May 11-15 2005
Come learn PHP in Paradise with us (and spend less than many other conferences)
Ilia Alshanetsky - Accelerating PHP Applications, Marcus Boerger - Implementing PHP 5 OOP Extensions, John Coggeshall - Programming Smarty, Wez Furlong - PDO: PHP Data Objects, Daniel Kushner - Introduction to OOP in PHP 5, Derick Rethans - Playing Safe: PHP and Encryption, George Schlossnagle - Web Services in PHP 5, Dan Scott - DB2 Universal Database, Chris Shiflett - PHP Security: Coding for Safety, Lukas Smith - How About Some PEAR For You?, Jason Sweat - Test-driven Development with PHP, Andrei Zmievski - PHP-GTK2
For more information and to sign up: http://www.phparch.com/tropics
Early-bird discount in effect for a limited time!
Get Zend Certified: at php|tropics, take the exam ...and we'll pay your fees!
The Magazine For PHP Professionals
TABLE OF CONTENTS
php|architect

Departments
Editorial: It’s All in a Day’s Work
What’s New!
Test Pattern: The Three Inch High Design Tool, by Marcus Baker
Interview: Priming PHP for the Enterprise, by Marco Tabini
Product Review: Vertrigo: The Utopia of All-in-One’s?, by Peter B. MacIntyre
Security Corner: Magic Quotes, by Chris Shiflett
exit(0);: HELP! I’m a PHP beauty stuck in the body of this Java programmer!, by Marco Tabini

Features
Crunching Data with PHP, by Christian Wenz
Turning a Class Into an Application With PHP-GTK, by Scott Mattocks
Strengthening the Authentication Process, by Graeme Foster
An XML approach to Templating using PHPTAL, by José Pablo Ezequiel Fernández Silva
March 2005 ● PHP Architect ● www.phparch.com
EDITORIAL RANTS
It’s All in a Day’s Work

It’s no wonder that we are getting paranoid about the security of air travel: airports and airplanes seem to be a breeding ground for odd and bizarre behaviour. For some unknown reason, the normal laws of civilized society don’t seem to apply over international waters, or as soon as you’ve passed the first (of many) security checkpoints.

On a flight, you’re forced to be in closer contact than you would ever allow under any other circumstances with people you have never met in your life and, most likely, would never want to have anything to do with if you knew them in the first place. Some of your fellow passengers are just plain inconsiderate, like the guy sitting next to you who takes off his shoes and the other one who drinks enough Martinis to kill a small horse.

Airport security, not to be outdone by the very same people it is meant to serve, is reaching new heights of stupidity. At one end of the line, an officer asks you to take off your shoes. “It’s optional, but if you don’t take them off they’ll search you at the other end of the line.” Well, duh… let’s see, should I take off my shoes now or everything else in the presence of that seven-foot-tall guard named “Bob” in thirty seconds? Um, let me think about it.

On my way back from a recent trip to California, I sat right behind the security checkpoint and listened in on a screener who was performing a search on a fellow passenger-in-waiting who had actually refused to take off his shoes. The best part was the introduction, which went something like “Sir, could you step to the side please. Now, I will have to perform a search of your person because you ‘fit the profile.’ Of course, we can’t tell you what the profile is, but this will only take a moment.” So, on one side of the line someone tells you exactly “what the profile is,” and, on the other, someone else tells you that the profile is a secret. I hate to be stating the obvious, but that strikes me as slightly odd. Then again, there is no limit to government silliness.

Meanwhile, back in Canada, our government is training more search dogs and pigs (yes, I said pigs) to sniff out smugglers. “Drugs,” you may be thinking? No. Illegally-imported food. It’s not the guy with the three-pound package of cocaine in his backpack that we should be worried about: the real criminal is the eighty-year-old Italian lady with the salami in her purse.

Until next month, happy readings!

php|architect
Volume IV - Issue 3, March 2005
Publisher: Marco Tabini
Editorial Team: Arbi Arzoumani, Peter MacIntyre, Eddie Peloke
Authors: Marcus Baker, Graeme Foster, Peter B. MacIntyre, Scott Mattocks, Chris Shiflett, José Pablo Ezequiel Fernández Silva, Christian Wenz
php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.
php|architect launches php|tropics 2005
Ever wonder what it's like to learn PHP in paradise? Well, this year we've decided to give you a chance to find out! We're proud to announce php|tropics 2005, a new conference that will take place between May 11-15 at the Moon Palace Resort in Cancun, Mexico. The Moon Palace is an all-inclusive (yes, we said all-inclusive!) resort with over 100 acres of ground and 3,000 ft. of private beach, as well as excellent state-of-the-art meeting facilities. As always, we've planned an in-depth set of tracks for you, combined with a generous amount of downtime for your enjoyment (and your family's, if you can take them along with you). We even have a very special early-bird fee in effect for a limited time only. For more information, go to http://www.phparch.com/tropics.
Zend Core for IBM
Zend Core for IBM is a complete, certified and fully supported distribution of PHP 5 that tightly integrates with IBM's DB2 and Cloudscape products, in addition to bundling all required third-party libraries for interaction with the outside world. The product includes such features as security updates, GUI-based management, granular control over configuration parameters and compatibility with Zend's other products, including Zend Platform. Zend Core will be available as a free download from both IBM's and Zend's websites in the second quarter of 2005. Support programs and Service Level Agreements will also be available for commercial clients in a variety of different configurations. For more information, visit the Zend Core for IBM site (http://www-306.ibm.com/software/data/info/zendcore/).
phpBlog 2.0.1
Want to get into the world of blogging? Are you currently running phpBB? If so, check out the latest release of phpBlog 2.0.1. The project’s homepage lists some of its features as:
• Trackbacks
• Monthly archives
• Miniblog
• RSS
• More…
For more information or to download, visit http://www.outshine.com/phpbbblog/

Zend Studio 4.0
Zend Technologies Inc. has announced the release of Zend Studio 4.0 (http://www.zend.com/store/products/zend-studio.php), a new version of its PHP integrated development environment (IDE). Zend Studio runs on multiple operating systems, including Mac OS X. According to the developer, the new release includes integrated support for all major database servers, including IBM DB2, Cloudscape, MySQL, Oracle, MS SQL Server, PostgreSQL, Derby and SQLite. New syntax highlighting works for XML and CSS; previously PHP, HTML, XHTML and JavaScript were supported. PHPDocs support has been added and PHPDocumentor now lets users create documentation directly from the PHP project source code. Zend Studio 4 comes in a Standard edition for US$99 and a Professional edition for $299. Both prices include tech support and one year of updates and upgrades. For more information visit: http://www.zend.com/
NEW STUFF
Maguma OpenStudio
Maguma GmbH (Bolzano, Italy) will make the source code of Maguma Studio, Maguma's Windows-exclusive IDE, open! Beginning in March 2005, the full source code of Studio will be available for download and community participation. Maguma OpenStudio, as Maguma has named the product, is a milestone in the realization of Maguma's Open Source strategy. Maguma OpenStudio is a fast, easy and effective PHP IDE for beginners and professional developers alike. The newest product, the modular cross-platform IDE Maguma Workbench, is Maguma’s second-generation IDE and is also community-focused: its flexibility allows users to create custom modules for it. Maguma’s goal is to allow programmers to "Have Fun Programming!" In March, Maguma OpenStudio will be available for download on the community site www.phpwizard.net and on the Maguma Community site http://community.maguma.org/. For more information visit: http://maguma.org
Check out some of the hottest new releases from PEAR.
DB_DataObject_FormBuilder 0.11.4
DB_DataObject_FormBuilder will aid you in rapid application development using the DB_DataObject and HTML_QuickForm packages. In order to have a quick but working prototype of your application, simply model the database, run DataObject's createTable script over it and write a script that passes one of the resulting objects to the FormBuilder class. The FormBuilder will automatically generate a simple but working HTML_QuickForm object that you can use to test your application. It also provides a processing method that will automatically detect if an insert() or update() command has to be executed after the form has been submitted. If you have set up DataObject's links.ini file correctly, it will also automatically detect if a table field is a foreign key and will populate a selectbox with the linked table's entries. There are many optional parameters that you can place in your DataObjects.ini or in the properties of your derived classes, that you can use to fine-tune the form-generation, gradually turning the prototypes into fully-featured forms, and you can take control at any stage of the process.
DB 1.7.1
DB is a database abstraction layer providing:
• An OO-style query API
• Portability features that make programs written for one DBMS work with other DBMS's
• A DSN (data source name) format for specifying database servers
• Prepare/execute (bind) emulation for databases that don't support it natively
• A result object for each query response
• Portable error codes
• Sequence emulation
• Sequential and non-sequential row fetching as well as bulk fetching
• Formats fetched rows as associative arrays, ordered arrays or objects
• Row limit support
• Transactions support
• Table information interface
• DocBook and phpDocumentor API documentation
Cache_Lite 1.4.1
This package is a little cache system optimized for file containers. It is fast and safe (because it uses file locking and/or anti-corruption tests).

XML_RPC 1.2.1
A PEAR-ified version of Useful Inc's XML-RPC for PHP. It has support for HTTP/HTTPS transport, proxies and authentication.

I18Nv2 0.11.3
This package provides basic support to localize your application, like locale-based formatting of dates, numbers and currencies. Besides that, it attempts to provide an OS-independent way to setlocale() and aims to provide language, country and currency names translated into many languages.
eZ publish 3.5.1
Ez.no announces the latest release of their content management system. From the announcement: “eZ publish is an open source content management system and development framework. As a content management system (CMS) its most notable feature is its revolutionary, fully customizable, and extendable content model. This is also what makes it suitable as a platform for general Web development. Its stand-alone libraries can be used for cross-platform, database independent PHP projects. eZ publish is also well suited for news publishing, e-commerce (B2B and B2C), portals, and corporate Web sites, intranets, and extranets. eZ publish is dual licenced between GPL and the eZ publish professional licence.” Get all the details from http://ez.no/

The Zend PHP Certification Practice Test Book is now available!
We're happy to announce that, after many months of hard work, the Zend PHP Certification Practice Test Book, written by John Coggeshall and Marco Tabini, is now available for sale from our website and most book sellers worldwide! The book provides 200 questions designed as a learning and practice tool for the Zend PHP Certification exam. Each question has been written and edited by four members of the Zend Education Board--the very same group who prepared the exam. The questions, which cover every topic in the exam, come with a detailed answer that explains not only the correct choice, but also the question's intention, pitfalls and the best strategy for tackling similar topics during the exam. For more information, visit http://www.phparch.com/cert/mock_testing.php
Looking for a new PHP Extension? Check out some of the latest offerings from PECL.

big_int 1.0.0
Functions from this package are useful for number theory applications, for example in two-key cryptography. See tests/RSA.php in the package for an example implementation of an RSA-like cryptoalgorithm. The package has many bitset functions, which make it possible to work with arbitrary-length bitsets. This package is much faster than the one bundled into PHP BCMath and covers almost entirely the functions implemented in the PHP GMP extension without requiring any external libraries.

Net_Gopher 1.0.0
An fopen() wrapper for retrieving documents via the gopher protocol. It includes an additional function for parsing gopher directory entries.

bz2_filter 1.1.0
A bzip2 compress/decompress stream filter implementation. It performs inline compression/decompression using the bzip2 algorithm on any PHP I/O stream. The data produced by this filter, while compatible with the payload portion of a bz2 file, does not include headers or trailers for full bz2 file compatibility. To achieve this format, use the compress.bzip2:// fopen wrapper built directly into PHP.

intercept 0.2.0
Allows the user to request that a user-space function be called when a PHP function is executed. Support for class/object methods will be added later.

mailparse 2.1.1
Mailparse is an extension for parsing and working with email messages. It can deal with rfc822 and rfc2045 (MIME) compliant messages.
FEATURE
Crunching Data with PHP
by Christian Wenz

There are various file formats to archive, pack, zip or crunch data. PHP supports many of them, in different ways: using external PHP scripts, PEAR packages or PHP extensions.

When it comes to transferring data over the Internet, making your files as small as possible is often a key element. It is not widely known, however, that PHP supports a variety of archive formats, in various ways: PHP extensions that are compiled in (or loaded using php.ini settings or dl()), PEAR packages and other external scripts. This article surveys the most important and relevant possibilities in this area, always with short examples that are ready to use in your applications.

PHP Extensions
From a performance point of view, using a PHP extension is very often the best way to solve a problem. Since you’re dealing with compiled code, performance is usually much better than with interpreted PHP code. However, not all of these extensions are updated on a frequent basis and some of them lack important features. But before judging, let’s first have a closer look.

The file format that is probably most widely used over the Internet is the ZIP format, because it has been around for a long time and applications to manipulate it are widely available on all platforms. Recent versions of Windows come with an internal ZIP module, but do not support other formats out of the box; Linux distributions and Mac OS X offer much more in this respect.
Therefore, in order to avoid the hassle of additional software installation, using the ZIP format is a good idea. There is even a PHP module that supports ZIP; you can find it in the online manual at http://php.net/manual/en/ref.zip.php. The module is a wrapper for the ZZIPlib library, a SourceForge project available at http://zziplib.sf.net/. This library supports only extracting data from an archive, not creating new ZIP files. Therefore, it can only be used with existing ZIP files. Doing so, however, is relatively easy. First, you have to ensure that the PHP module is present. If you are building PHP yourself, you have to run configure with the --with-zip=/path/to/zziplib switch; Windows users just need to add the following line to their php.ini file:

    extension=php_zip.dll
REQUIREMENTS
PHP: 4.x, 5.x
OS: Any
Other Software: The modules and packages referenced in the article.
Code Directory: crunch
Under some PHP configurations, the gds32.dll file (which resides in the dll subdirectory of your PHP 4 installation) has to be copied into a directory that is in the system path, e.g. c:\windows\system32; under PHP 5, this DLL is located in the main installation directory. Afterwards, phpinfo() shows that the library is there.

Then, accessing the data within the ZIP archive consists of a number of standard steps:
• Open the ZIP archive with zip_open().
• Use zip_read() to iterate through the contents of the ZIP file.
• Open one single file within the archive with zip_entry_open().
• Read its contents with zip_entry_read().
• To clean up, use zip_entry_close() and zip_close().
And here is how it’s done: an archive is opened and its contents are written to the current directory. Here is the relevant excerpt from the code:

    $zip = zip_open($archive);
    // zip_open() may return an error code instead of false, so test for a resource
    if (is_resource($zip)) {
        while ($entry = zip_read($zip)) {
            if (zip_entry_open($zip, $entry, 'rb')) {
                // Output file
                zip_entry_close($entry);
            }
        }
        zip_close($zip);
        echo 'done.';
    }
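To make the excerpt more concrete, here is a minimal sketch of a complete extraction routine built from the functions listed above. It is not the magazine’s Listing 1: the function name extract_zip_flat() and its behaviour (flattening stored paths with basename(), returning the number of files written) are assumptions made for illustration, and it relies on PHP 5’s file_put_contents(). The procedural zip_* functions are deprecated as of PHP 8.0, so the sketch checks for them first.

```php
<?php
/**
 * Hypothetical helper: writes every file entry of a ZIP archive into
 * $targetDir (flat, without subdirectories). Returns the number of files
 * written, or false if the archive or the zip_* functions are unavailable.
 */
function extract_zip_flat($archive, $targetDir)
{
    if (!function_exists('zip_open')) {
        return false; // procedural ZIP API not available
    }
    $zip = @zip_open($archive);
    if (!is_resource($zip)) {
        return false; // zip_open() yields false or an error code on failure
    }
    $written = 0;
    while (is_resource($entry = zip_read($zip))) {
        $name = zip_entry_name($entry);
        if (substr($name, -1) === '/') {
            continue; // directory entry, nothing to write
        }
        if (zip_entry_open($zip, $entry, 'rb')) {
            $size = zip_entry_filesize($entry);
            $data = $size > 0 ? zip_entry_read($entry, $size) : '';
            // basename() drops any path stored inside the archive
            file_put_contents($targetDir . '/' . basename($name), $data);
            zip_entry_close($entry);
            $written++;
        }
    }
    zip_close($zip);
    return $written;
}
```

Calling extract_zip_flat('archive.zip', '.') would mirror the article’s behaviour of writing into the current directory; flattening the paths is a deliberate simplification that also keeps entries from escaping the target directory.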
Listing 1 contains the complete code, including a PHP 4-compatible version of file_put_contents(), which is a function available in its native form only in PHP 5. The library also contains some additional functions for gathering information about the files in the archive, including their size. In practice, being only able to extract data is a serious limitation; that’s why there are other classes that offer additional functionality. You will find them in the user comments on the ZZIPlib manual page and later on in this article.

Another file format that, in many cases, achieves better compression ratios than ZIP is BZIP2, which even offers a compression mode that takes longer and uses more memory, but ultimately generates even smaller files. Unfortunately, this functionality is not built into many operating systems and, therefore, only very few software packages are available exclusively in BZIP2. Of course, PHP supports BZIP2 by wrapping the bzip2 library available from http://sources.redhat.com/bzip2/. Unfortunately, this module is limited to compressing or decompressing individual files only. Therefore, it is viable for multiple files only if you first merge them into one tarball (more information on TAR support can be found later in this article; it does not come automatically with PHP). Also, installation of the library is required; then, a run of configure with the --with-bz2=/path/to/bzip2 switch, followed by a build, will introduce BZIP2 functionality in your scripts. Windows users get the binary module php_bz2.dll as part of the official distribution, but have to explicitly load it, either with dl() or by adding this line to their php.ini file:

    extension=php_bz2.dll
Using the library is easy, and only requires a small number of steps. For compressing data, the following procedure has to be followed:
• Create a BZIP2 archive using bzopen().
• Use bzwrite() to successively write data to the archive; the crunching is done automagically.
• Finally, close the file with bzclose().
Here is a short version of this algorithm that reads in one file and writes it into a BZIP2 archive; Listing 2 contains the complete code. Note that file_get_contents() is binary-safe and, therefore, it can be used to retrieve the original file’s contents. Otherwise, you can use fopen() in 'rb' mode, iterate through the file and provide the data to bzwrite().

    $infile = dirname(__FILE__) . '/php.ini-recommended';
    $outfile = dirname(__FILE__) . '/test.bz2';
    $out = bzopen($outfile, 'w'); // bzopen() supports only the 'r' and 'w' modes
    bzwrite($out, file_get_contents($infile));
    bzclose($out);
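The alternative mentioned above, iterating through the source file instead of loading it in one go, could look like the following sketch. The function name bz2_compress_file() and the 8 KB chunk size are assumptions chosen for illustration; the bz2 calls are the ones named in the steps above.

```php
<?php
/**
 * Hypothetical helper: compresses $infile into $outfile chunk by chunk,
 * so the whole file never has to fit into memory. Returns true on success.
 */
function bz2_compress_file($infile, $outfile)
{
    if (!function_exists('bzopen')) {
        return false; // bz2 extension not loaded
    }
    $in = fopen($infile, 'rb'); // read the source in binary mode
    if ($in === false) {
        return false;
    }
    $out = bzopen($outfile, 'w'); // bzopen() accepts only 'r' or 'w'
    if ($out === false) {
        fclose($in);
        return false;
    }
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if ($chunk !== false && $chunk !== '') {
            bzwrite($out, $chunk); // compression happens inside bzwrite()
        }
    }
    fclose($in);
    bzclose($out);
    return true;
}
```

For small files the one-line file_get_contents() version is simpler; the chunked loop mainly pays off when the input is too large to hold comfortably in memory.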
The way back, decompressing the files, is performed in a similar way. Here, bzread() is used to progressively read all the data out of the archive and decompress it into its original format. The data can be stored in a buffer (and then written to the hard disk using file_put_contents()), or directly saved, piece by piece, using fputs(), as seen in the following snippet. Listing 3 contains the complete code.

    $in = bzopen($infile, 'r'); // bzopen() supports only the 'r' and 'w' modes
    $out = fopen($outfile, 'wb');
    while ($data = bzread($in, 1024)) {
        fputs($out, $data, 1024);
    }
    bzclose($in);
    fclose($out);

The module offers a bit more flexibility if you use the bzcompress() function. This compresses a string provided in the first parameter, using the block size specified in the second parameter. The block size is a value between one and nine (inclusive) and has a default value of 4. A value of nine gives the best compression, albeit at the cost of increased system resources during the data-crunching activity.

Finally, PHP supports Zlib, the compression library behind GZIP, which is available from http://www.gzip.org/zlib/. Again, installation is required: DIY-compilers have to configure PHP with the --with-zlib=/path/to/zlib switch, whereas Windows users have this functionality already built in (starting with PHP 4.3.0), with no installation, configuration or php.ini tweaking required. This library is most often used to GZIP data sent to the browser on the fly, to make the transfer of web pages smaller and, therefore, quicker. Nowadays, most web browsers support this functionality and advertise it by sending the Accept-Encoding: gzip or Accept-Encoding: deflate HTTP headers (or both). If this is the case, PHP can send compressed data across the wire if the following php.ini setting is enabled:

    zlib.output_compression = On

Note that (theoretically) nothing can go wrong if the browser does not support GZIP compression because, in that case, no corresponding Accept-Encoding HTTP header is sent and, therefore, PHP does not GZIP the data. Older versions of Netscape have a bug with embedded, compressed media, but do not have a reasonable market share any longer.

However, the Zlib extension can also be used to compress files on the fly, as well as to uncompress them, of course. In contrast to the ZZIPlib library, this extension also allows you to create archives. The standard steps apply, again with a couple of new function names:
• Create an archive using gzopen().
• Write data to the archive using gzwrite().
• Close the file using gzclose().
Here are the relevant lines (complete code in Listing 4):

    $out = gzopen($outfile, 'wb4');
    gzwrite($out, file_get_contents($infile));
    gzclose($out);
The second parameter to gzopen() is the file mode ('wb') plus the (optional) compression level, in this case an average four. There are other options, including level-nine compression, which creates smaller files (and requires larger memory footprints). Uncompressing files works in a very similar way:
• Open an archive using gzopen().
• Read data from the archive using gzread(), until gzeof() returns true.
• Close the file using gzclose().
Again, here’s a simple snippet of code, taken straight out of the larger example that you can find in Listing 5:

    $in = gzopen($infile, 'rb');
    $out = fopen($outfile, 'wb');
    while (!gzeof($in)) {
        fputs($out, gzread($in, 1024), 1024);
    }
    gzclose($in);
    fclose($out);
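Put together, the two GZIP procedures form a round trip. The sketch below wraps each direction in a small function; the names gz_compress_file() and gz_uncompress_file(), the chunk size and the default level are illustrative assumptions and not taken from Listings 4 and 5.

```php
<?php
/**
 * Hypothetical helper: GZIPs $infile into $outfile using the given level (0-9).
 */
function gz_compress_file($infile, $outfile, $level = 9)
{
    $in = fopen($infile, 'rb');
    if ($in === false) {
        return false;
    }
    // File mode plus compression level, e.g. 'wb9'
    $out = gzopen($outfile, 'wb' . max(0, min(9, (int) $level)));
    if ($out === false) {
        fclose($in);
        return false;
    }
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if ($chunk !== false && $chunk !== '') {
            gzwrite($out, $chunk);
        }
    }
    fclose($in);
    gzclose($out);
    return true;
}

/**
 * Hypothetical helper: restores the original data of a GZIP file.
 */
function gz_uncompress_file($infile, $outfile)
{
    $in = gzopen($infile, 'rb');
    if ($in === false) {
        return false;
    }
    $out = fopen($outfile, 'wb');
    if ($out === false) {
        gzclose($in);
        return false;
    }
    while (!gzeof($in)) {
        $data = gzread($in, 8192);
        if ($data === false) {
            break; // read error: leave the loop instead of spinning
        }
        fwrite($out, $data);
    }
    gzclose($in);
    fclose($out);
    return true;
}
```

Compressing a file with gz_compress_file() and feeding the result to gz_uncompress_file() should give back a byte-for-byte copy of the input.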
The Zlib extension offers some other functions, including gzfile() and gzpassthru(), which work similarly to file() and fpassthru(), but also uncompress the data they read from the GZIP file. Similarly, gzcompress() directly compresses a string (retrieved, for instance, with file_get_contents()), whereas gzuncompress() restores such a string to its original form; gzdeflate() and gzinflate() do the same with raw DEFLATE data.

PHP Streams
Starting with PHP 4.3.0, the concept of streams was introduced. They already existed in previous versions, but only in a very limited form: as HTTP wrappers for fopen(), or for Zlib (zlib:). However, the latter was removed from PHP 4.3.0 onwards and replaced by something less ambiguous. PHP now supports a lot of protocols and wrappers for streams, including two that can be used to compress data:
• compress.bzip2:// for BZIP2
• compress.zlib:// for GZIP, the “successor wrapper” to the old zlib:.
The installation for stream wrappers works analogously to the procedure we illustrated earlier for PHP extensions. For compress.bzip2://, you need the bzip2 extension (to recall: --with-bz2 if you compile PHP manually, extension=php_bz2.dll under Windows). If you want to use GZIP, you have to provide the compilation switch --with-zlib, whereas Windows users have this functionality built into their binary distributions.

From there on, usage is as simple as working with any stream; it’s like working with a file. You do not have to worry or care about compressing or uncompressing, but just work with it like you would with any other PHP stream: just read from or write to it, and PHP takes care of the rest in a completely transparent fashion. Here is how it’s done for GZIP: a file is read in and compressed, and then uncompressed again:

    // Compressing
    $data = file_get_contents($infile);
    file_put_contents("compress.zlib://$outfile", $data);
    ...
    // Uncompressing
    $data = file_get_contents("compress.zlib://$infile");
    file_put_contents($outfile, $data);
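Because the wrappers behave like ordinary streams, the same idea also works without reading the whole file into a string first. The following sketch, with the hypothetical helper name copy_stream(), pipes one stream into another using stream_copy_to_stream() (available from PHP 5); whether a side is compressed depends only on the URI you pass in.

```php
<?php
/**
 * Hypothetical helper: copies everything from $sourceUri to $targetUri.
 * Either side may carry a wrapper prefix such as compress.zlib://.
 */
function copy_stream($sourceUri, $targetUri)
{
    $in = fopen($sourceUri, 'rb');
    if ($in === false) {
        return false;
    }
    $out = fopen($targetUri, 'wb');
    if ($out === false) {
        fclose($in);
        return false;
    }
    stream_copy_to_stream($in, $out); // PHP (un)compresses transparently
    fclose($in);
    fclose($out);
    return true;
}

// Compressing:   copy_stream('data.txt', 'compress.zlib://data.txt.gz');
// Uncompressing: copy_stream('compress.zlib://data.txt.gz', 'copy.txt');
```

Swapping the prefix for compress.bzip2:// should give the BZIP2 variant, provided the bz2 extension is loaded.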
The download code for this issue contains the complete listing, including a tweak to make it backwards-compatible with PHP 4 (where file_put_contents() does not exist); Listing 6 shows the complete source code for the same task being performed using BZIP2 compression.

PEAR Packages
PEAR does not have a category specifically dedicated to archive files, but it does have one for file formats (pear.php.net/packages.php?catpid=33). There, you will find (as of March 2005) two packages that are relevant to our quest:
• Archive_Tar (pear.php.net/package/Archive_Tar) for tarballs
• Archive_Zip (pear.php.net/package/Archive_Zip) for ZIP files
The first of these two packages is automatically distributed with PEAR, since the installer uses it to unpack and install PEAR modules. Nevertheless, it might be a good idea to run pear list-upgrades to check whether new versions exist (or, specifically, pear upgrade Archive_Tar to install them). Currently, there is no web-based end-user documentation available in the PHP Manual, only documentation generated automatically from the PHPDoc comments in the source code. There is also a text file in PEAR’s doc directory that contains rather detailed information about the package.

Thankfully, the package can be used in a straightforward manner. Again, it’s just a matter of taking the right steps in the right order:
• First, load the PEAR module: require_once 'Archive/Tar.php';
• Then, initialize the class: $tar = new Archive_Tar('filename.tar');
• Finally, provide a list of file names: $tar->create(array('/path/to/file1', '/path/to/file2'));
And that’s all: the files are automatically stored in a tarball. If you do not want to pass a monster array to the create() method, you have two other choices:
• Once you have created the TAR file, you can add more files to it using the add() method, again providing an array of files.
• Instead of using an array, you can also provide a space-separated list of file names, as long as your file names do not contain spaces.
Here is a small example that creates a mini PHP distribution: we take three files from a PHP binary distribution package and pack them into a single tarball:
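A minimal sketch of such a script, assuming Archive_Tar is installed through PEAR. The function name build_mini_distribution() and the file names in the usage comment are hypothetical; only the constructor, create() and add() come from the steps described above.

```php
<?php
// Archive_Tar ships with PEAR; the @ keeps this sketch quiet if it is missing.
@include_once 'Archive/Tar.php';

/**
 * Hypothetical helper: packs $files into the tarball $tarName, the first
 * file via create() and the remaining ones via add().
 */
function build_mini_distribution($tarName, array $files)
{
    if (!class_exists('Archive_Tar') || count($files) === 0) {
        return false;
    }
    $tar = new Archive_Tar($tarName);
    $first = array_shift($files);
    if (!$tar->create(array($first))) {
        return false; // create() starts a fresh archive
    }
    // add() appends to the archive that create() has just written
    return count($files) === 0 ? true : (bool) $tar->add($files);
}

// Hypothetical file names from an unpacked PHP binary distribution:
// build_mini_distribution('miniphp.tar',
//     array('php.exe', 'php5ts.dll', 'php.ini-recommended'));
```

Splitting the work between create() and add() is only there to show both methods; passing all three names to create() in one array would do the same job.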
It is also possible to create subdirectories. In order to do so, you can use createModify() instead of create(), or addModify() instead of add(). As a second parameter, you need to provide the name of the directory where the files shall be placed:
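As a hedged illustration, again assuming a PEAR installation with Archive_Tar: the helper name build_tarball_in_subdir() and the directory name used in the comment are made up for this sketch.

```php
<?php
// Archive_Tar ships with PEAR; the @ keeps this sketch quiet if it is missing.
@include_once 'Archive/Tar.php';

/**
 * Hypothetical helper: stores $files in a new tarball, placing each of
 * them below the directory $subdir inside the archive.
 */
function build_tarball_in_subdir($tarName, array $files, $subdir)
{
    if (!class_exists('Archive_Tar')) {
        return false;
    }
    $tar = new Archive_Tar($tarName);
    // Second parameter: directory name put in front of every stored file
    return (bool) $tar->createModify($files, $subdir);
}

// build_tarball_in_subdir('miniphp.tar',
//     array('php.exe', 'php.ini-recommended'), 'php');
```

As far as I know, createModify() and addModify() also accept an optional third parameter naming a leading path to strip from the stored file names, which helps when the source files are given with long absolute paths.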
Listing 7 contains a souped-up version: here, the constructor for Archive_Tar gets a second parameter, the compression mode to be used. Valid values for this parameter are 'bz2' for BZIP2 and 'gz' for GZIP. Then, the PEAR module automatically compresses the files after merging them into a tarball. Thus, you get both effects: compacting several files into one distribution tarball and then making the latter’s file size significantly smaller.

Extracting files from the TAR/TGZ/TAR’ed BZ2 archive is even easier to implement: you just open the archive and then extract its contents to the specified path:
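Both halves, the compressed tarball and its extraction, fit into one short sketch. It assumes Archive_Tar from PEAR plus the Zlib extension for 'gz' (or the bz2 extension for 'bz2'); the function name pack_and_unpack() is hypothetical and this is not Listing 7.

```php
<?php
// Archive_Tar ships with PEAR; the @ keeps this sketch quiet if it is missing.
@include_once 'Archive/Tar.php';

/**
 * Hypothetical helper: writes $files into a compressed tarball and then
 * extracts that tarball again into $targetDir.
 */
function pack_and_unpack($tarName, array $files, $targetDir, $mode = 'gz')
{
    if (!class_exists('Archive_Tar')) {
        return false;
    }
    // The second constructor parameter selects the compression: 'gz' or 'bz2'
    $tar = new Archive_Tar($tarName, $mode);
    if (!$tar->create($files)) {
        return false;
    }
    // Open the archive again and unpack everything below $targetDir
    $archive = new Archive_Tar($tarName, $mode);
    return (bool) $archive->extract($targetDir);
}

// pack_and_unpack('miniphp.tgz', array('php.ini-recommended'), '/tmp/unpacked');
```

In real code the two steps would normally live in different scripts; they are combined here only to keep the example self-contained.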
Now to the second relevant PEAR package, Archive_Zip, which, internally, requires PHP’s Zlib extension. Unfortunately, as of the time of this writing, the PEAR module has not seen any release yet (an issue which is, interestingly, also filed as a bug). However, the module maintainer, Vincent Blavet, does react to bug reports and currently maintains the package exclusively in PEAR’s CVS system. You will find the current state of the code at cvs.php.net, which also offers web-based access to the repository at http://cvs.php.net/pear/Archive_Zip/. There, you will find one file, Zip.php, that you can download and manually place into the Archive directory of your PEAR installation. Then, require_once 'Archive/Zip.php'; loads the module.

Once you have done this, you can create ZIP archives using this procedure:
• Instantiate the class, providing the target file name as the parameter.
• Create the ZIP archive with the create() method, providing a list of files (as an array or a comma-separated list).
• Optionally, add further files using the add() method.
As you can see, the syntax is quite similar to the one used by Archive_Tar. The main difference is probably in the way you place files under a specific path inside the archive: you set the add_path option when using create() or add(). Listing 8 contains the complete code for this example.
The file created by this script indeed uses the directory name provided. Getting the files back can be accomplished with very little code as well: you just need to open the file and call extract(). Again, the add_path parameter can set a path to be used, this time as the directory into which the archive’s data is extracted.
Both PEAR packages offer some more functionality, but the elements I have just shown should fulfill most requirements. Feel free to explore the source code further for some interesting insights, and maybe ask the maintainer of Archive_Zip to officially release his (really nice and functional) package!

Notable External Scripts
Although the methods for manipulating archives using PHP I have shown so far are excellent, there are always alternatives, which is one of the good things about Open Source. For instance, the SourceForge project called PKZip library for PHP, available at http://sf.net/projects/phpziplib/, provides a nice alternative way to create ZIP archives, but one that has very limited support for reading in ZIP files. You need the Zlib extension and, starting from version 0.3, PHP 5. The reason: the author declares his methods as public or private, which is basically a good thing, but unfortunately not supported by PHP 4. However, the rest of the module is fully PHP 4-compliant, so all you have to do to maintain backwards compatibility is to remove all occurrences of public and private in the code. You can then include the ziplib.php file in your code and follow these steps (great, more lists!):
• Instantiate the class: $zip = new Ziplib;
• Add some files with the method zl_add_file(), providing the file’s contents, its name and the compression method.
A few words about compression methods: the library supports three possibilities. n stands for none, b for BZIP compression and g for GZIP compression. If the parameter is not provided, the class automatically uses GZIP. After the character for the compression method, the compression level follows, ranging from zero (no compression) to nine (maximal compression, both in space saved and time and memory consumed). Here is a small example:

    require_once 'ziplib.php';
    $zip = new Ziplib;
    $zip->zl_add_file(
        file_get_contents('php.ini-recommended'),
        'ini/php.ini-recommended'
    );
    $data = $zip->zl_pack('Archive created with PHP!');
    file_put_contents('test.zip', $data);
Listing 9 1
Listing 9 contains a complete, runnable example. Another package that provides archive manipulation facilities is available at http://www.phpclasses.org/browse/package/945.html, and a quick Google search will turn up more alternatives.

Summary

As this article has shown, working with archives from within PHP is both possible and quite easy; it is just not as well documented and talked about as other aspects of the language. There are several possible ways to use archives, and all relevant file formats are supported. This functionality can be really useful in your projects: you can create your own download area where users can retrieve files that are compressed on the fly before being sent out. (In this case, however, you should implement “funky caching”: once you have crunched a file, you save the result so that the next time someone requests the same data, you do not have to crunch it again.) Also, users may have the opportunity to submit archives to your website, and you can have your script take a look at what is inside.
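The “funky caching” idea from the summary can be sketched in a few lines of plain PHP. This is only a minimal illustration under some assumptions: the function name cached_gzip() and the cache directory layout are hypothetical, the code uses gzencode() from the Zlib extension rather than any of the archive packages discussed above, and it relies on PHP 5 functions such as file_put_contents().

```php
<?php
// Hypothetical helper: returns the path of a gzip-compressed copy of $source,
// compressing the file only when no up-to-date cached copy exists yet.
function cached_gzip($source, $cacheDir)
{
    if (!is_file($source)) {
        return false;
    }
    if (!is_dir($cacheDir) && !mkdir($cacheDir, 0777, true)) {
        return false;
    }

    // Assumption: the source path is enough to identify the cached copy.
    $cached = $cacheDir . '/' . md5($source) . '.gz';

    // Reuse the cached copy unless the source is newer than the cache.
    if (is_file($cached) && filemtime($cached) >= filemtime($source)) {
        return $cached;
    }

    // Crunch the file once and keep the result for later requests.
    $compressed = gzencode(file_get_contents($source), 9);
    if ($compressed === false || file_put_contents($cached, $compressed) === false) {
        return false;
    }
    return $cached;
}
```

A download script could call cached_gzip() with the requested file and a writable cache directory, then send the returned file with readfile(); only the first request for a given file pays the compression cost.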
About the Author
Christian Wenz is author or co-author of over four dozen books, frequently writes for renowned IT magazines and speaks at conferences around the globe. He is Germany’s very first Zend Certified Professional, principal at the PHP Security Consortium and maintainer or co-maintainer of several PEAR projects.
To Discuss this article:
http://forums.phparch.com/204
Turning a Class Into an Application With PHP-GTK
by Scott Mattocks
Tired of having to write a new script for every PEAR package he released, Scott Mattocks decided to wrap the PEAR_PackageFileManager class in a GUI to make generating package files a snap. The following article details the process that he went through to create his application and highlights how you can use PHP-GTK to do the same with your classes.
Oftentimes, a wonderful class is written that makes a developer’s life much easier. But almost just as often, the class requires the developer to write a script that loads a specific set of data and that, in turn, can’t be used again. I first came across this problem when writing dozens of scripts that each loaded a different unit test and simply output the results. I wondered how easy it would be to create an application that loaded the test cases and showed the results without having to write the same code over and over again. I came across the same problem when trying to create PEAR package files: the PEAR_PackageFileManager class makes creating PEAR package.xml files easy, but writing the scripts that load the data drives me crazy. That’s when I decided that it would be easier to create a PHP-GTK application to collect the data than it would be to write those scripts. It turns out that it isn’t that hard at all. Hopefully, you will find the process that I went through helpful when you find yourself in the same situation that I did.

REQUIREMENTS
PHP: 4.3.x
OS: N/A
Other Software: PHP-GTK 1.01
Code Directory: gtk

RESOURCES
URL: http://gtk.php.net

Getting Started

The first thing to do when turning a class into an application is to take a good look at what you are starting with. It is important to understand how you would use the class in a script if you want to create a useful GUI. The public methods of the class give a good indication as to what your application should be doing. The PEAR_PackageFileManager class, for example, has an addMaintainer() method. Therefore, our application should have an “add maintainer” feature. This may sound pretty obvious, but consciously thinking about it will give you a good start on setting up your GUI. If you have an incomplete or confusing layout, you might as well stick to writing scripts.

Lupus in fabula: before getting down to work, we should probably decide on a general layout for our application. The PEAR_PackageFileManager class performs several tasks that are mostly independent of one another. This isn’t to say that calling just one method gives you a valid package file, but adding a maintainer doesn’t rely on the user adding a dependency first. Because of the design of PEAR_PackageFileManager, I have decided to use a GtkNotebook as the main display widget. A GtkNotebook is a widget with pages that are marked by tabs; the user can click a tab to bring a given page to the top. I am sure you have seen this type of layout several times on various websites, or in applications like Mozilla. A notebook layout makes it easy to show only the tools that we need at any given time and hide the rest. The GtkNotebook also helps keep our GUI small by letting us stack things in 3-D, instead of having a huge window with all of the tools displayed at once.

Setting up the notebook is easy. After you have created your notebook object, you just insert or append a new page whenever you need one. A page consists of a container holding all of the widgets for the page and a GtkLabel for the tab. You may be asking, “Why do I have to put everything into another container? Why can’t I just put everything into the notebook page?” A GtkNotebook page is a descendant of GtkBin, and descendants of GtkBin can only have one child widget. This may sound like a strange limitation, but it isn’t, really. If you only have one child, you don’t have to worry about ordering, positioning or anything else that comes along with having multiple child widgets. It keeps the notebook pages simple and leaves the more complex container stuff to specialized widgets. To allow us to have a page with more than one widget in it, we just need to make sure that the page’s child is some sort of container that can hold more than one child of its own, like a GtkHBox. Then, we can fill the child container up with whatever we want, and the notebook page will still only have one child.
If you look at Listing 1, you’ll notice that I have several helper methods that take care of creating each page’s widgets. Each helper method returns an array holding the container for the page and a label for the tab. You should also notice that I have connected a method to the switch-page signal. This signal is emitted any time the top page of the notebook changes. It doesn’t matter if the page changes because the user clicks on a tab or because our code tells the notebook to bring a different page to the front: by connecting the signal to the showWarnings() method, our code will check for any warnings every time a different page is brought to the front of the notebook. If there are any warnings, our application will bring the user to the warnings page.

There are four methods for connecting signals to callbacks. While they do pretty much the same thing, understanding the differences between them can save you a lot of headaches down the road. In this instance, I have used connect_object(). The difference between connect() and connect_object() is that connect() passes the widget that emitted the signal to the callback function. This is useful when you have one callback that is called by multiple widgets and you need to know which widget emitted the signal. connect_object(), on the other hand, does not pass the widget that emitted the signal. Using connect_object() will make the callback methods a little more straightforward. You’ll see an example of when you might need to use connect() a little later, but in this case it would just complicate things needlessly.

Listing 1

function &_buildNotebook()
{
    // Create the notebook.
    $this->notebook =& new GtkNotebook();

    // Add the setOptions page.
    $this->_addNotebookPage($this->_createSetOptionsPage());
    // Add the addMaintainer page.
    $this->_addNotebookPage($this->_createAddMaintainerPage());
    // Add the addDependencies page.
    $this->_addNotebookPage($this->_createAddDependenciesPage());
    // Add the warnings page.
    $this->_addNotebookPage($this->_createWarningsPage());

    // When the page is switched, get any new warnings.
    $this->notebook->connect_object('switch-page', array(&$this, 'showWarnings'));

    // Return the notebook.
    return $this->notebook;
}

function _addNotebookPage($page)
{
    // Add the container and the tab label.
    $this->notebook->append_page($page[0], $page[1]);
}

Putting the Pieces Together

Ok. So we have our application all set up and ready to start working. The feature we should probably add first is the warnings page. This page will be a visual representation of the getWarnings() method, and will need to display all of the warnings that are generated when the user tries to add or change any information. The widget that we want to use to show the warnings needs to be easy to update and must scroll in case there is a lot of information to display. Even though the underlying GTK+ implementation of GtkText is technically “broken,” it is still our best choice in this situation. The GtkText widget is very similar to an HTML textarea: it allows for text to be easily added and will scroll when there is too much to display at one time. We can also set the text area so that the user cannot directly edit the text. This is a good idea to ensure that nobody accidentally deletes a few lines and then gets confused as to why their package file didn’t get built properly.

The showWarnings() method that is connected to the notebook’s switch-page signal simply grabs the array of warnings from the package file manager and adds each one on its own line in the warnings area. To add the text, we just call the insert_text() method of the GtkText widget. insert_text() takes two parameters, the text to add and the position. As with most string functions in PHP, -1 indicates the end of the string. Listing 2 shows the code for this page.

It would be pretty annoying if the warnings just piled up and there was no way to get rid of them. That’s why I have also added a “clear” button. A GtkButton is a container widget that listens for events from the user, like pressing a key or clicking with the mouse. When a user clicks on a button, the appropriately named clicked signal is emitted. We want the clear button to get rid of all of the warnings, so we connect the clicked signal of the clear button to the delete_text() method of the GtkText widget. As you can see, you don’t always have to connect signals to your own functions; you can connect them straight to another widget’s methods instead.

Figure 1
Listing 2

function _createWarningsPage()
{
    // Pack everything in a vBox.
    $vBox =& new GtkVBox();

    // Set up the warnings area.
    $this->warningsArea =& new GtkText();
    $this->warningsArea->set_editable(false);
    $this->warningsArea->set_word_wrap(false);

    // Add a button to clear the warnings area.
    $hBox =& new GtkHBox();
    $button =& new GtkButton('Clear');
    $button->connect_object('clicked', array(&$this->warningsArea, 'delete_text'), 0, -1);
    $hBox->pack_end($button, false, false, 5);

    // Pack everything in.
    $vBox->pack_start($this->warningsArea, true, true, 10);
    $vBox->pack_start($hBox, false, true);

    return array($vBox, new GtkLabel('Warnings'));
}

function showWarnings($message = NULL)
{
    if (!empty($message)) {
        $this->warningsArea->insert_text($message . "\n", $this->warningsArea->get_length());
    }
    foreach ($this->_packageFileManager->getWarnings() as $warning) {
        $this->warningsArea->insert_text($warning['message'] . "\n", $this->warningsArea->get_length());
    }
}
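Listing 2 relies on two details of signal connection: connect_object() leaves the emitting widget out of the callback’s arguments, and any extra arguments given when connecting are handed to the callback later. The following sketch emulates that behavior in plain PHP so it can be tried without PHP-GTK. FakeWidget and FakeText are hypothetical stand-ins written for this illustration, not PHP-GTK classes, and the code assumes PHP 5.3 or later because it uses a closure.

```php
<?php
// Plain-PHP stand-in for a widget that can emit signals (hypothetical, not PHP-GTK).
class FakeWidget
{
    public $name;
    private $handlers = array();

    public function __construct($name)
    {
        $this->name = $name;
    }

    // Mimics connect(): the emitting widget becomes the first callback argument.
    public function connect($signal, $callback)
    {
        $this->handlers[$signal][] = array($callback, true, array_slice(func_get_args(), 2));
    }

    // Mimics connect_object(): the emitting widget is not passed along.
    public function connect_object($signal, $callback)
    {
        $this->handlers[$signal][] = array($callback, false, array_slice(func_get_args(), 2));
    }

    // Calls every handler registered for $signal and collects the return values.
    public function emit($signal)
    {
        $results = array();
        $handlers = isset($this->handlers[$signal]) ? $this->handlers[$signal] : array();
        foreach ($handlers as $handler) {
            list($callback, $passWidget, $userData) = $handler;
            $args = $passWidget ? array_merge(array($this), $userData) : $userData;
            $results[] = call_user_func_array($callback, $args);
        }
        return $results;
    }
}

// Stand-in for the warnings area; only delete_text() is mimicked.
class FakeText
{
    public $text = "first warning\nsecond warning\n";

    public function delete_text($start, $end)
    {
        // A negative end position means "through the end of the text".
        $tail = ($end < 0) ? '' : substr($this->text, $end);
        $this->text = substr($this->text, 0, $start) . $tail;
    }
}

$area  = new FakeText();
$clear = new FakeWidget('clear');

// As in Listing 2: the callback receives only the user data 0 and -1.
$clear->connect_object('clicked', array($area, 'delete_text'), 0, -1);

// With connect(), the widget arrives first, followed by the user data.
$clear->connect('clicked', function ($widget, $note) {
    return $widget->name . ' ' . $note;
}, 'was clicked');

$results = $clear->emit('clicked');
// $area->text is now empty; $results[1] holds 'clear was clicked'.
```

The emulation only covers argument passing; real PHP-GTK signals may also supply their own arguments (such as the page for switch-page) ahead of the user data.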
Hopefully, you picked up on the two extra arguments at the end of the connect_object() call in Listing 2. These two are user data that will be passed to the callback function. They are passed, in order, after any arguments that are automatically supplied by the signal. Here, we passed zero as the start character to be deleted and -1 as the last character to be deleted. When the user clicks on the clear button, everything in the text area will be discarded.

If you look at Figure 1, you’ll see that the clear button appears on the right of the page. This is because I used pack_end() instead of pack_start(). This function works just the same way as pack_start(), with the exception that it adds widgets to the end of the container. For GtkVBoxes, this means that widgets are packed from the bottom of the container up. For GtkHBoxes, children are packed from right to left. There will be more on packing widgets in just a few paragraphs.

Next, let’s look at adding a maintainer. This is where we get into really wrapping the PEAR_PackageFileManager class into our GTK application. A maintainer is someone who contributes to a PEAR package. In the package.xml file, they are identified by four pieces of information: their handle, their name, their email address, and their role in the package. As a result, the addMaintainer() method expects these four pieces of information as arguments. The number and type of parameters a function takes give us a clue as to what kinds of widgets we will need. The developer’s handle, name and email address can be just about anything, so GtkEntry fields will probably be the best fit. A GtkEntry is very similar to an HTML text input box. It allows the user to enter one line of text. The role, on the other hand, has some predefined legal values; therefore, a GtkCombo will work best for this type of data. The layout of this page is going to be a little more complicated than the one for the warnings page.
A lot of developers might use Glade to take care of the interface work, but laying out the widgets by hand really isn’t that difficult, and I think you learn more if you do it yourself. Anyway, let’s get down to business. We already have a method to set up the notebook page that we are going to use, so all we have to do is add our new maintainer widgets. We will do this by adding a method called _createAddMaintainerPage(). This method will create our information-gathering widgets, plus add a way for us to get the information to the package file manager.

Take a look at Listing 3: the layout of the widgets is controlled using a combination of GtkVBoxes and GtkHBoxes. Take note of the three parameters at the end of pack_start(): these are often forgotten, but can save you lots of headaches down the road. The first, expand, tells the container whether or not the child widget should be given a share of any extra space when the container is larger than its children need, including when the container is resized. The second, fill, tells the container whether the child widget itself should be resized to take up all of the space allotted to it, rather than keeping its own size and leaving the rest empty. The last parameter is the amount of padding added around the child widget when it is added to the container. The same three parameters also apply if you are using pack_end(). If you have ever wondered how to stop widgets from being bigger than you told them to be, now you know.
“...PHP-GTK isn’t quite as difficult or scary as its reputation may have led you to believe.”

The GtkEntry widgets used for the developer’s handle, name and email address are pretty basic, so I am going to focus on the GtkCombo for the role instead. A GtkCombo is, as its name implies, a combination of a GtkEntry and a GtkList. When we set up our page, we can deal with each piece of the GtkCombo separately, but still treat them as one widget when it comes time to position them.

First, let’s look at the GtkEntry portion. If we didn’t modify it, the entry part would function just like any other GtkEntry in our application, but the whole idea of using the GtkCombo was so that users had to select the role from a list, as opposed to typing it themselves. To make sure that only the list values are used, therefore, we need to prevent the user from editing the text in the entry area. This is done the same way as we did for the warnings area, with set_editable().

Setting up the list is a little more complicated; a GtkList is a container that will only accept GtkListItems as children. To add the developer roles, we have to add one GtkListItem for each. When the list items are created, each one gets tagged with its role. This is done using set_data(), which is a GtkObject method, which means that all widgets have it. It comes in handy if you just want to mark an object with a certain value. It is easier for us to call get_data() to retrieve the role than it is to loop through the children of the selected list item.

Before we exit from this function, we need to create a button that we can use to add the new maintainer information to the package file manager. We do this by first creating a GtkButton, which we will label with 'Add Maintainer'. Then, we connect the clicked signal to the _addMaintainer() method. In the middle of Listing 3, you should see that I have decided to use connect_object() for this purpose.
Again, this is because the function we are connecting to the clicked signal does not need to know which button was pressed and, therefore, doesn’t expect the widget as its first parameter. We add all of the information widgets to the connect_object() call because our method for adding a maintainer needs to extract the developer’s information. If you try to pass $emailEntry->get_text(), you are passing the return value from that method at the time you connect the signal. I have seen many developers do that and then wonder why their application isn’t working the way they want. We need to pass the widgets themselves so that we can get the return value from get_text() at the time the user clicks the button. The method can then pass the correct values along to PEAR_PackageFileManager::addMaintainer().

After we’ve added the maintainer, it is nice to let the user know what happened and then clear out the old data so they can add another developer. To do this, we pass one more widget to our _addMaintainer() method: a GtkLabel. When the user hits the “Add Maintainer” button, we just set the label text to an appropriate message.

Ok, that was easy. Let’s take a look at something a little tougher. The whole idea of creating our desktop application is to make things easier to use. Consider what it might take to implement the “add dependency” feature. We need two pieces for this page: one for adding the packages that can be used as dependencies, the other to show the dependencies that we have already added. For adding the dependencies, we could do the same thing we did for the add maintainer feature, but that would let the user add anything they want as a dependency and would open up a lot of room for errors. It would be a big help to the user if we instead grabbed the available packages and let them choose from what’s available. We can get the list of installed packages by using the PEAR_Registry class, which lets us grab information about installed packages, such as their name, current version number, description, and changelog.
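Before moving on to the registry, the earlier warning about passing $emailEntry->get_text() instead of the widget itself can be reproduced without PHP-GTK. The sketch below uses two hypothetical stand-in classes, StubEntry and StubButton, that imitate only the calls needed here; the email address is made up, and the code assumes PHP 5.3 or later because it uses closures.

```php
<?php
// Hypothetical stand-in for a GtkEntry; only get_text() and set_text() are mimicked.
class StubEntry
{
    private $text = '';

    public function set_text($text)
    {
        $this->text = $text;
    }

    public function get_text()
    {
        return $this->text;
    }
}

// Hypothetical stand-in for a button that remembers one callback plus its user data.
class StubButton
{
    private $callback;
    private $userData = array();

    public function connect_object($signal, $callback)
    {
        // Everything after the callback is stored now and replayed on click.
        $this->callback = $callback;
        $this->userData = array_slice(func_get_args(), 2);
    }

    public function click()
    {
        return call_user_func_array($this->callback, $this->userData);
    }
}

$emailEntry = new StubEntry();

// Pitfall: get_text() runs at connect time, while the entry is still empty.
$wrong = new StubButton();
$wrong->connect_object('clicked', function ($email) {
    return $email;
}, $emailEntry->get_text());

// Better: pass the entry itself and read its text inside the callback.
$right = new StubButton();
$right->connect_object('clicked', function ($entry) {
    return $entry->get_text();
}, $emailEntry);

// The user types only after the signals have been connected.
$emailEntry->set_text('user@example.com');

$stale = $wrong->click(); // still the empty string captured at connect time
$fresh = $right->click(); // the text present when the button is clicked
```

The same reasoning applies to every widget handed to _addMaintainer(): the callback should receive the widgets and query them when the click actually happens.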
PEAR_Registry is perfect for getting our dependency information, while a GtkCTree widget is the perfect choice for displaying it in a hierarchical fashion. We need a hierarchy here because we want to let the user select not only a particular package but also the minimum version needed. Listing 4 shows how to build the tree. After a little setup, we instantiate a PEAR_Registry object and loop through the results of packageInfo(). For each package found, we build a node with the package name as the label. We also build a child node for the version and one node for each version in the changelog. The most important part is the call to node_set_row_data(). By passing an array containing the package name and version number, we are tagging each row with the corresponding information. Then, when the row is selected, we can grab that information and pass it on to the package file manager. If you look at the manual page for the tree-select-row signal, you will see that the callback method gets not only the tree but also the node that was selected. By setting up our call to connect() and our _addDependency() method accordingly, we can grab the right package and version and pass it on to the package file manager and the widget that displays the current list of dependencies.

To show the current dependencies, I have decided to use a GtkCList widget. To set up the list, we need to pass the number of columns and an array with the column labels. A GtkCList is a pretty complex widget if you get into all the bells and whistles, but all we need to do is add entries, sort them and resize the columns as appropriate. Let’s start with sorting the columns, because that is the easiest of our operations. After building the widget, we just call the set_auto_sort() method; we don’t need to give it any parameters or call any other methods. The GtkCList will now automatically sort our packages by package name and version number. Simple enough, right?

Now how about resizing the columns? This gets a little tougher. You actually have to pass some parameters. To have the columns automatically resize, call column_set_auto_resize(), passing the column number (zero in our case) and true. This will make the first column stretch and shrink to fit its widest entry. The other columns will adjust as needed without affecting the size of the widget.

Listing 5 shows how we grab the data from the tree and pass it to the list and package file manager. When a node from the tree is selected, our callback method grabs the data using node_get_row_data(). After verifying that we have retrieved the package information successfully, we just add the package to the list by calling insert(). We don’t have to worry about the position at which to insert the information, because the list will automatically sort its entries anyway.

Listing 5

function _addDependency($tree, $node, $unknown, $list)
{
    $data = $tree->node_get_row_data($node);
    if (count($data) == 2) {
        $result = $this->_packageFileManager->addDependency($data[0], $data[1]);
        // Check for errors.
        if (PEAR::isError($result)) {
            $this->_pushWarning($result->getCode(), array());
            //return;
        }
        // Add the dependency to the clist.
        $list->insert(0, $data);
    }
}

For both parts of this page, it is pretty easy to picture the widgets running out of room quickly. Instead of making our GUI huge to accommodate the lists, we can put each widget inside a GtkScrolledWindow container. When the child inside the scrolled window gets too big, scroll bars will appear. You can control when the scroll bars are visible by setting the scroll bar policy. I set the tree’s scrolling window to show the vertical scroll bar only when it is needed and to never show the horizontal scroll bar. When the user expands the tree, the scroll bar will appear.

The next piece of our puzzle is the “set options” feature. A lot of the work for PEAR_PackageFileManager is performed by setOptions(), which takes an associative array of options where the key of each element provides the option’s name and the corresponding value the option’s value. To make things easier for the end user, each option should be its own widget. There are lots of options that can be set, but I am just going to focus on one of the more interesting ones: the package directory. The package directory is the directory that contains all of the package files. The best widget for selecting files or directories is the appropriately named GtkFileSelection, which is similar to the save or open file dialogs you are used to seeing in most applications. A file selection is a big widget that appears on its own. We don’t always want it to be shown, so we will use an entry to show the file path and a button to show the file selection when it is needed.

Setting up the entry and the button should be pretty much a matter of routine by now, so let’s jump straight to the file selection widget. When the “Select” button is clicked, it calls the file selection’s show() method. Because GtkFileSelection extends GtkWindow, we can control its position when it pops up. Instead of letting it appear wherever the operating system feels like, let’s make it show up in the mid
Scott Mattocks is a developer for Yamaha Music Interactive. When he isn’t working with PHP and PHP-GTK, he enjoys relaxing at home with his wife and dog. If you have any questions or comments feel free to email him at [email protected].
To Discuss this article:
http://forums.phparch.com/205
Interview
Priming PHP for the Enterprise
by Marco Tabini
“PHP in the Enterprise” is beginning to sound like “the paperless office.” Luckily, there’s a lot less of a vapourware aura around the former than around the latter, as this interview with Cornelius Willis, SourceLabs’ Vice-president of Sales and Marketing, shows.
As the new year gets under way, and the topic of Enterprise PHP comes back into our discussions, one company has taken matters into its own hands in a bid to produce a set of open-source tools that large organizations can depend upon. If the largest fear that an enterprise-class organization has of open-source software is that it’s not dependable and that no-one steps up to the plate to offer the appropriate level of support for it (what could otherwise be called the “I have no-one to blame” factor), Seattle, Washington-based SourceLabs was founded precisely with the intent of providing “certified” software stacks that have been thoroughly tested and can be supported in a mission-critical environment.

SourceLabs’ approach to the problem is, at least on the surface, very simple. They start by building a “stack” of software based either on a customer’s needs or on general common-sense decisions. For example, in our case this may constitute Apache, PHP and MySQL, all running on a Linux platform of choice. The newly-formed stack is then subjected to a series of tests, which the company has collectively dubbed “CERT-7.” These include seven different levels of inspection, ranging from security to scalability, reliability and even regression testing. Once the stack passes muster in a particular configuration, they recommend its installation, usage and maintenance, and provide support contracts for those customers interested in this type of service.

While the enterprise-level IT manager will undoubtedly find this service useful for the peace of mind that it affords in the face of increasing IT scrutiny (especially in large public companies), the overall concept could be really useful for everyone who uses open-source software for mission-critical applications.

Don’t Take My Word For It…

A great idea is always backed (or dispelled) by a set of solid, hard data. In SourceLabs’ case, this was a particularly challenging aspect of their business, as most of the data available out there is hardly… well, hard at all. Therefore, they commissioned a full study from Evans Data to determine whether open source in general (and PHP in particular) was anywhere near reaching a level of acceptance that would justify, or require, the introduction of testing methodologies and supported stacks.

A Friendly Chat

Armed with this information, we thought it might be a good idea to let the SourceLabs folks speak for themselves and explain to our readers (and to us, since we were quite curious ourselves) who they are, what they do, and why the PHP community should take notice. We hooked up with Cornelius Willis, SourceLabs’ Vice-president of Marketing, for a quick Q&A session:
php|architect: Can you introduce our readers to SourceLabs?

Cornelius Willis: SourceLabs provides tested, certified, pre-integrated open source infrastructure software (“stacks”) free of charge, and sells support and maintenance subscriptions for those stacks. While there are many excellent open source projects, there are no certified and supported combinations of open source systems. We are in the process of assembling and testing our first downloadable stack, based on AMP (Apache, MySQL, PHP), and will continue to test and integrate stacks in the future.

php|a: What is your goal?

CW: We want to transform the enterprise software market. Software should be provided at much lower cost and without the vendor lock-in that has characterized so many technology business models to date. Open source software provides an instrument that will allow us to accelerate this restructuring. Our goal is to provide the best of both worlds: software without lock-in that is also dependable, and backed by the same full-service support and maintenance that a legacy enterprise software vendor provides today.

php|a: What is your target market?

CW: IT users in Global 2000 companies, ASPs, and ISVs.

php|a: Can you tell us about CERT7?

CW: CERT7 is our overall test methodology. It’s how we determine and improve the overall dependability of an open source stack. The testing regime embodied in CERT7 is similar to that used by infrastructure software providers such as Oracle, IBM, SAP and Microsoft, but is something that is essentially unprecedented in the open-source world. Open-source testing is usually limited to functional testing (e.g.: “does it work as per the design requirements?”) within the context of a single product module. CERT7 testing takes a more holistic view of the dependability and performance of an entire integrated stack, as the “system”-level test areas called out in Figure 1 indicate. You can find out more about CERT7 and subscribe to the CERT7 forum at: http://www.sourcelabs.com/c7.php

Figure 1: CERT7 Test Areas. Columns: Legacy Enterprise Software, Open Source Software, SourceLabs CERT7 Testing. Rows: Unit Functional Testing, System Functional Testing, System Stress Testing, System Scalability Profile, System Failover Profile, System Security Testing, System Regression Testing.

php|a: Tell us about your LAMP stack.

CW: We are currently working on a certified AMP (Apache, MySQL and PHP) stack, which we will test on the leading Linux implementations. We are also considering doing a version tested on Windows. We’d be interested in hearing from the php|architect community about their interest in such an offering. This distribution will have been tested and certified per the CERT7 test regime, and we will share our certification test results to help buyers understand the reliability profile of the software in unprecedented detail. We will also provide critical updates as needed and regular service packs.

php|a: How does your LAMP stack fit with Zend’s new Platform product? Are you a competitor?

CW: No. Zend Platform is focused on providing the reliability, scalability and interoperability that business-critical PHP applications need. It delivers features that enterprise developers and system administrators require: run-time diagnostics, performance management, PHP configuration and control, and interoperability with Java. SourceLabs’ AMP stack will be freely downloadable, and its main features are its certification process and the SourceLabs support contract for it. Customers could definitely use Zend Platform with a SourceLabs AMP stack.

php|a: Why did you commission the Evans Data report?

CW: We wanted to get a clear understanding of LAMP stack adoption as we entered the market. Frankly, the data we’d been hearing directly from IT strategists and buyers was at odds with the hype in the marketplace, so we wanted to understand how developers were spending their precious training and evaluation time. Profiling the habits and adoption patterns of this population is what Evans Data specializes in. The data is fascinating: it shows that developers are heavily investing in MySQL, PHP, and Perl, which is a good leading indicator. This is substantiated by the success of lots of open-source projects with the developer community, a phenomenon that many of us are familiar with. But our primary research (and direct interviews with target Fortune 2000 IT organizations) shows us that lack of integrated systems, lack of mission-critical support, and lack of dependability certification and testing are impeding IT adoption of open source infrastructure beyond Linux. The conclusion we’ve reached is that developers, as one would expect, reflect the leading edge of a larger trend. The opportunity for SourceLabs is giving the IT buyer confidence in open source infrastructure: one throat to choke for support on integrated, tested and certified open source systems, without technology agendas or lock-in business models.
php|a: Can you give us some of the important highlights from the report?
CW: The data gives a fascinating snapshot of some of the places where open source development is growing, and where it is lagging. Some of the highlights:
1) The open source scripting languages are strong, but still have a ways to go to overtake the Microsoft-supported scripting languages. This tracks (no surprise) directly to the displacement of Windows server infrastructure by Linux. This is one of the reasons we think that a WAMP offering might be interesting to the market, as a way to provide a migration strategy for mixed-server shops. (See Figure 2 for a breakdown.)
Figure 2
Language | Will use or are evaluating | Not evaluated or will not use
JavaScript | 78% | 19%
VBScript | 62% | 37%
Perl | 54% | 43%
PHP | 48% | 50%
2) MySQL is widely used, evaluated, and tested, with 48.5% of database developers (and it is gaining even more). This is an example of where the gap between what developers download and use and what their IT gatekeepers are aware of and sanction is notable.
[Figure 3]
3) Adoption of PHP, as a signal indicator of open source scripting adoption, lags in the enterprise space (just over 40% evaluating, intending to or using) with the
largest evaluated/will not use response, but is strongest (near 60% evaluating, intending or using) among small VARs/SIs. This datapoint at once shows the opportunity for companies like Zend and SourceLabs, and also the ongoing challenge of delivering the demonstrated dependability that IT organizations demand. (See Figure 3.)
php|a: What picture does the report paint of LAMP?
CW: I think the report shows that LAMP is just entering mainstream adoption now. When I talk to people who are deep in the open software movement, and who have been using LAMP for years, they often perceive this as a naïve, marketing guy sort of comment. But LAMP has a way to go among US IT buyers who aren’t used to doing their own testing, integration and support. With some of these services coming on line from companies like SourceLabs, LAMP is really coming into its own, and emerging as one of the most important platforms of this decade.
php|a: How do you see the data provided by the report changing in the near future?
CW: Clearly, Linux, MySQL and PHP adoption are on a dramatic upswing. Apache is reaching saturation, but more and more companies are trying out open source infrastructure higher in the stack every day.
php|a: What about the long term?
CW: We’ll see a much higher penetration of enterprise IT shops with LAMP (and other OSS infrastructure); over the next year or two, companies and organizations like Zend, Apache, and MySQL will play a big part in this. Open-source software gives developers and users of software a fundamentally better way to transact, by cutting out the structural and pricing inefficiencies of the current business model. This is similar to the way Napster restructured our thinking about the music business. At SourceLabs, our goal is to deliver services that will give IT customers confidence in open source and in so doing dramatically accelerate that restructuring.
To Discuss this article: http://forums.phparch.com/206
Strengthening the Authentication Process
by Graeme Foster
There are many occasions in which we require a user to log in to access different parts of a web site. This process is so common that few users are confused by a login screen. The problem is that when the user sends this data, it is transmitted in plain text (assuming you’re not over an HTTPS connection). This means that anyone who has access to the network can discover the password by examining the packets that are sent across it. In many cases, the problem doesn’t warrant the extra expense of an HTTPS connection, but it would be nice if the password could be sent encrypted rather than in plain text. This is where the PHP programmer needs to turn away from PHP and look at how JavaScript can be used to solve the problem. A careful blending of the two technologies can provide an excellent solution that will deter all but the most determined hacker.
Security is a balancing act between the value of the data (or of access to certain privileges) and the effort required to get at it through unauthorized means. A bulletin board probably comes fairly low on the security horizon, whereas credit card data is right up there with heads of state. This article is not a replacement for security tools like HTTPS, but it can help with the dozens of low-priority login situations which barely register a blip on the security radar.
Login Screen
I’ve lost count of the number of times that I have written a login procedure. I would joke that I could do it in my sleep, and that is when it starts to get dangerous, because I’m coding without thinking, or, at least, I’m not thinking about the reason for writing the code. In fact, let me challenge you to pause for a moment and think about the purpose of a login screen. We ask a user to log in so that we know who they are and, once their identity has been established, we can determine what they can do. There are two distinct parts to this process: authentication and authorization.
The authentication process is about ascertaining who the person is; the web server says something akin to “Stop! Who goes there?” while the browser may respond “Don’t worry, it’s only me.” Obviously, the
data we get needs to be a bit more specific than that, which is why we have the familiar login screen. The “Stop! Who goes there?” is replaced with the request to enter your user name and password. The reply is now formalized: I enter my name and my password, so my reply becomes “Hi, I’m Graeme and my password is ‘Help’.” I won’t go into the details of what makes a good password—suffice to say that “Help” isn’t one! Once the user has gone through the authentication process, the server is able to determine who the person at the other end of the Internet is. By referring to local records, the server is also able to determine what that user is authorized to do. “Okay, so you’re Graeme and, according to my records, you’re the administrator; that means you can perform the list of actions in the adminAction.inc file.” This is the authorization process and, when I’m coding, it attracts the majority of my attention—usually because it takes up most of the time. A typical approach would be to break the users down into different groups and, for each group, create a list of valid actions. This works fine for small systems, but as the groups and authorization process become more complex, the development of ACLs may become necessary. What I want to look at here is not the authorization process, but, rather, the authentication process—that is, the process of actually finding out who is at the other
end of the line. This process is very simple, but within its simplicity lies the danger of complacency.
First, let me provide the code for a simple login screen, made up of two files: let’s call the page that contains the form login1.php (Listing 1) and the page that receives the data submitted by it check1.php (Listing 2). The login process is straightforward to implement and it is a pervasive means of authentication. Listing 1 contains a bare-bones login form which consists of two text boxes and two buttons, as shown in Figure 1. When either of the buttons is pressed, the form will post the name and password details to the check1.php script. This, in turn, will capture the POST variables and then see if they match a registered user/password pair. In this case, there is only one valid combination, which has been hard coded, but, for the moment, that should be sufficient to illustrate the process.
The login form in Listing 1 has some space for error messages to be returned to the user. In this article, I don’t use them at all; however, they are important for advising the user if they have omitted to complete a field. If the authentication process fails, it’s important to always use a very generic error message. Sending out a message like “The user name is unknown” is very helpful to the user, but it is of more help to someone who is intent on breaking into your system. A more generic
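The check described for check1.php can be sketched roughly as follows. This is not the author's Listing 2: the function names checkLogin and loginMessage, the field names name and password, and the wording of the messages are assumptions made for illustration, and the single hard-coded pair simply reuses the “Graeme”/“Help” example from the text.

```php
<?php
// Hypothetical stand-in for a check1.php-style script; not the article's Listing 2.

/**
 * Returns true when the submitted pair matches the single hard-coded account.
 * The field names 'name' and 'password' are assumed, not taken from Listing 1.
 */
function checkLogin(array $post): bool
{
    // Treat missing or non-string fields as empty input.
    $name = (isset($post['name']) && is_string($post['name'])) ? $post['name'] : '';
    $password = (isset($post['password']) && is_string($post['password'])) ? $post['password'] : '';

    // One hard-coded valid combination, as in the article's example.
    $validName = 'Graeme';
    $validPassword = 'Help';

    // Evaluate both comparisons before combining them, so the code path
    // does not depend on which of the two fields was wrong.
    $nameOk = hash_equals($validName, $name);
    $passwordOk = hash_equals($validPassword, $password);

    return $nameOk && $passwordOk;
}

/**
 * One generic message for every failure: it never says whether the user
 * name or the password was the part that did not match.
 */
function loginMessage(bool $ok): string
{
    return $ok
        ? 'Login successful.'
        : 'Login failed. Please check your details and try again.';
}

// In a real page the argument would be $_POST.
echo loginMessage(checkLogin(['name' => 'Graeme', 'password' => 'Help'])), "\n";
```

Keeping the comparison and the message in separate functions makes it easy to confirm that every failure path produces the same text.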
[Listing (markup not preserved): title and heading “The title goes here”, subheading “The subtitle goes here”, paragraph “This is a bogus paragraph, all the paragraphs of this page will look like this one.”]
An XML approach to Templating using PHPTAL
Associative Arrays
Let’s now examine a more complex example using arrays of associative arrays. We’ll start with the logic or driver, which is very similar to the previous one; you can see it in Listing 4 (which I would call example2.php). We only define a title (line 6) and then an array, named $links, containing associative arrays (from line 7 to line 17). $links contains information about other pages, including their names, URLs and descriptions. You have probably noticed that the second element in the array does not have a description; you’ll see why in a moment. Lines 19 and 20 export the variables to our template and the rest is the same as in the previous example.
The template is shown in Listing 5; the only new thing here is the table, where the links variable is iterated through. Each of its elements will be accessed as link. Since each item is an associative array, however, we’ll refer to $link['name'] as link/name, $link['url'] as link/url and $link['description'] as link/description. As I mentioned earlier, this is inherited from the Zope model, where almost everything is an object that is accessed through a URL (you can even access folders, scripts and functions inside the scripts as URLs). On the last field of the table, description, you’ll see something new: |default. This operand tells PHPTAL that, if it cannot find the variable whose content should be used in the template (link/description in our case), it should use the current (or default) content of the tag. So, the www.php.net page, which has no description as we noted earlier, will have No description available as its description (after all, it doesn’t really need one, does it?). You can see the output of this program in Listing 6. Line breaks may not be respected by PHPTAL, but, since they have no meaning in XML (other than single blank spaces between words), you shouldn’t really care.
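The driver data and the path notation described above can be sketched in plain PHP. This is only an illustration of the idea; it is neither the article's Listing 4 nor the way PHPTAL is implemented internally. The function name resolvePath, the sample names and the sample description are assumptions.

```php
<?php
// Illustrative only: mimics the idea of slash-separated TAL paths and the
// "| default" fallback in plain PHP. It does not use or reproduce PHPTAL.

// An array of associative arrays, as described in the text. The second
// entry deliberately has no description. Names and description are sample values.
$links = [
    ['name' => 'Example', 'url' => 'http://www.example.com', 'description' => 'An example site'],
    ['name' => 'PHP', 'url' => 'http://www.php.net'],
];

/**
 * Resolve a path such as "link/description" against a context array.
 * Returns $default when any step of the path is missing, which is the
 * role the tag's existing content plays with "| default".
 */
function resolvePath(array $context, string $path, string $default): string
{
    $current = $context;
    foreach (explode('/', $path) as $key) {
        if (!is_array($current) || !array_key_exists($key, $current)) {
            return $default;
        }
        $current = $current[$key];
    }
    return is_scalar($current) ? (string) $current : $default;
}

// Iterate over the links, exposing each element as "link".
foreach ($links as $link) {
    $context = ['link' => $link];
    echo resolvePath($context, 'link/name', ''), ' | ',
         resolvePath($context, 'link/url', ''), ' | ',
         resolvePath($context, 'link/description', 'No description available'), "\n";
}
```

The second row falls back to the default text because its array has no description key, which is the behaviour the |default operand provides in the template.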
I would recommend sending the output of PHPTAL through some kind of filter (like tidy) to remove all the unneeded blank spaces anyway (that way, you’ll save bandwidth).
More TAL
Now that we have a better grasp on the basics of TAL, it’s time for me to introduce a few more TAL attributes that can be useful for more serious templating.
[Listing 4: example2.php]
[Listing 5 (markup not preserved): template with the title and heading “The title goes here” and a table with the columns Name, URL and Description; the sample row reads “Example”, “http://www.example.com”, “No description available”.]
[Listing 3 (markup not preserved): page titled “The first example to PHPTAL”, subtitled “Isn't it cute ?”, with the paragraphs “This is the first paragraph of this page” and “This is the second paragraph of this page, this might not be a good way to build pages, but it's a good way to explain how PHPTAL works”.]
One that doesn’t seem to make much sense at first is define, which creates another variable as if we had exported it from our PHP script. When you build big and complex pages, however, you’ll find it very useful. Let’s take a look at an example:
Later on, you can use it as follows:
Blah!
Various variables can be defined in the same define attribute, like so:
The comment parameter, as its name implies, allows you to embed comments in your template: something useful for the programmer and graphical designer that won’t be included in the output. It’s an excellent tool for the programmer to explain what his intentions are to the graphical designer. For example:
When you need to only display the text of an element (that is, its contents without the tag), but, for some reason, you need to include the tag in your source code (for example, to define a loop, condition, macro, and so on), you can use the omit-tag parameter (personally, I’ve never found a need for it; I prefer to use a block tag instead):
div opening and closing tags will disappear, this will remain
There’s also an error-catching parameter called onerror. It can be used to provide PHPTAL with a string that will be shown in case a variable can’t be accessed. One very interesting usage of this parameter would be as follows: ...
Here, errorMessage is a dynamically-generated error message containing contact information and, possibly, an error number, something like: “We are sorry to inform you that an unexpected error occurred. Please inform the administrators of this situation by writing an e-mail to [email protected]. Please include the following error number in your message: 47352. Thank you.” If anything fails when rendering the page, that error will be shown. A very common failure is caused by trying to access a variable that is not defined (and doesn’t contain a default value).
Macros in TAL
When you build a large website, you’ll probably have a generic header that should be on every page, as well as a generic footer and, probably, a menu that applies to all the pages. That is where METAL (Macro Extensions to TAL) comes in handy. Normally, the METAL namespace is assigned to the metal prefix in an XHTML template with the following parameter to the html tag:
xmlns:metal="http://xml.zope.org/namespaces/metal"
METAL defines four parameters; let’s take a look at them. To define a macro, we use define-macro, as in the following example:
php|architect The Magazine For PHP Professionals
The tag will be part of the macro (the same behaviour we saw with tal:repeat and the tag earlier in the article). To execute a macro, we can use the use-macro parameter; for example, the following code could be used to execute the macro called “header”:
This will be replaced by the header.
If the “header” macro is part of a different file called usefulmacros.html, the previous example would look like this:
This will be replaced by the header.
Here, again, you can see how using slashes as separators makes sense. The other two parameters defined by METAL allow us to pass parameters to a macro as if it were a function (the parameters are called “slots”). A slot is defined by using the define-slot parameter. For example:
php|architect The Magazine For PHP Professionals
The caller would use the fill-slot tag to pass a subtitle, like this:
Header here! Home page
The same code could also be written this way:
Home page
The metal:block tag works analogously to tal:block— it acts as a template-level container for content that will be removed from the output.
[Listing 7]
Macros in Action
We’ll now take a look at a very complete example that will make use of macros. This example could serve as a starting point to build a real-world web site. You’ll notice that we will be getting very close to building a CMS (Content Management System). I believe that every web site, deep down, is some kind of CMS, based on the fact that I’ve had to add some level of CMS functionality in pretty much every site I have built.
“PHPTAL is an implementation of TAL, the Template Attribute Language, together with its own set of extensions.”
Let’s start with the logic, which is extremely simple; you can see it in Listing 7. It doesn’t define any variables; it just outputs example3.html. In Listing 8, you can see the template that, in real life, would be named example3.html. Line 5 shows our new namespace. From line 6 to line 10 we have the XHTML header that creates the title, keywords and description tags, all from macros located in example3-content.html (Listing 10, described below). From line 11 to line 17, we have the body, also created out of macros, this time taken from both example3-content.html and example3-helper.html. Inside the body, on line 15, we are filling a slot that should be defined in the example3-helper.html/footer macro.
Let’s move on to example3-helper.html, shown in Listing 9. Here, you can see an important concept for the collaboration between graphical designer and programmer at work. example3-helper.html is a fully-functional XHTML file, but only some parts of it (the macros) will be used. Nothing from lines 1 to 12 is needed for the system to work properly, but the code is there to welcome the graphical designer who is going to work on it. It introduces the various special tags to him, so that the chances of the file getting broken are smaller (than if that information was buried in a manual nobody ever reads). Lines 13 to 18 create the menu macro, while lines 21 to 24 create the footer macro, which has a slot as expected, defined on line 23.
Now the last file, example3-content.html, which can be seen in Listing 10. We have lots of macros defined there, for the title, the keywords, descriptions and content. You can see how the title macro is defined in the title tag, the keywords macro in the keywords tag, and so on. These tags are likely editable through HTML applications, which, in turn, should respect the METAL parameters and leave them untouched while, at the same time, making it really easy for anyone to change the contents of the macros.
Summary
In this article, we barely scratched the surface of what PHPTAL can do; there is a lot more to it. The truth is that real web sites tend to be very complex and, if you try to write them using only what we’ve examined thus far, you’ll find lots of gaps that are not visible at first sight. In the next article, we’ll cover the remainder of PHPTAL’s features, some techniques (those useful things that are never in a manual and that only experience can teach you) and compare PHPTAL to other templating systems.
About the Author
José Pablo Ezequiel Fernández Silva, who is also known as “Pupeno,” is a software developer who has been building web sites since 1997 using Apache, CGI, PHP, MySQL, Zope, Python and Plone, among other tools. He has developed desktop applications and spoken at conferences and Free Software events. Pupeno has also participated in the creation of a Linux-based set-top box for DVDs. He is currently researching better concepts for graphical interfaces and desktops.
[Listing 9, example3-helper.html (markup not preserved), note to the designer: “Dear graphical designer, these are small pieces of XHTML that will be included in the output page, feel free to change anything inside them but be sure not to remove the metal:define-macro and metal:define-slot parameters nor the metal:block tags, since they are needed by the program.”]
[Listing text (markup not preserved): “This is an example of extensive use of METAL of TAL in PHP using PHPTAL.” “This page is built of a lot of small pieces that are spread in different files.”]
Security Corner
Magic Quotes by Chris Shiflett
Welcome to another edition of Security Corner. This month's topic is magic quotes, the collective term for the behaviour of the magic_quotes_gpc, magic_quotes_sybase, and magic_quotes_runtime configuration directives. My approach this month is a bit different, because instead of highlighting a best practice or describing a security safeguard, I want to warn you against relying on insufficient protections. Betting all your security money on magic quotes is more than just a red herring; it is actually a very poor practice, and this month's Security Corner will explain why.
Filter Input, Escape Output
As frequent readers of Security Corner already know, you should always filter input and escape output. These are key components of a secure application’s foundation, and a good software design focuses on making these tasks easy for developers. The major problem with the magic quotes directives isn’t just that they usurp a developer’s control (they do), but also (and especially) that they take the wrong approach: they escape input. Remember, you should filter input and escape output, not escape input. The PHP manual page for magic_quotes (http://www.php.net/magic_quotes) states that: “Magic Quotes is a process that automagically escapes incoming data to the PHP script. It’s preferred to code with magic quotes off and to instead escape the data at runtime, as needed.”
Why Is This So Bad?
In the May 2004 edition of Security Corner, I spoke about data filtering with a specific focus on filtering input (external data). A best practice concerning data filtering is to take a whitelist approach: consider all data to be invalid unless it can be proven valid. While not always practical, this is the safest approach. When data is escaped before it is filtered, the filtering logic becomes more complex, and unnecessary complexity is never a best practice, regardless of the topic. When the topic is security, complexity breeds mistakes, and mistakes in data filtering logic represent a security weakness at the very least and, most likely, a vulnerability. I refer to this complexity as unnecessary because the complexity introduced by escaping data before filtering it can be removed by simply performing these tasks in the proper order.
Filtering Versus Escaping
Filtering is specific to the type of data you’re dealing with. If you’re attempting to prove the validity of a user’s year of birth, you might require that it be a positive integer that is four digits long and fits within reasonable limits. Thus, filtering is just whatever logic you use to test data and verify your assumptions. This is often a weak point in an inexperienced developer’s implementation, because it’s not very intuitive that you should have to prove your assumptions. This is especially true when you use a select form element or something similar that restricts what legitimate users can enter.
Escaping is quite a bit different. This is the process by which you prepare data to be sent to some remote
(external) system, be it the user’s browser, a MySQL database, or a web API. The purpose of escaping data is to keep the data intact. Because certain characters might be interpreted in special ways by the remote system, these characters can be escaped so that their original meaning is preserved. Examples of escaped characters are \' instead of ' for MySQL, and &lt; instead of < for HTML.
Note: this article speaks mostly about the magic_quotes_gpc directive, which is by far the most common. For more information on magic quotes, refer to the PHP manual page at http://www.php.net/magic_quotes.
With magic quotes, escaping is performed on input before the first line of code. By the time you get the data, it has already been prepared for a very specific use: being sent to a database. There are many reasons why this is a poor approach, with three big ones being:
1) Escaping input makes filtering it more complex and error prone.
2) Escaping for one particular purpose limits what can be done with the data, without the added complexity of restoring the data to its original form.
3) The escaping performed by magic quotes is not thorough enough (for any system) to be considered a security safeguard.
In “Magic Quotes and Add Slashes in PHP,” Harry Fuecks writes, “Never stripslashes! That’s the golden rule. You should never have to use stripslashes. Ever.” This is a strong point: if you’re having to take steps to restore data to its original form, then you’ve done something terribly wrong.
In “Why Use Magic Quotes,” the PHP manual states, “Magic quotes are implemented in PHP to help prevent code written by beginners from being dangerous. Although SQL Injection is still possible with magic quotes on, the risk is reduced.” Reducing risk is what security is all about. However, is the risk of SQL injection really reduced? I think not. I’ve already mentioned the unnecessary complexity that magic quotes add to your filtering logic, but there’s the additional risk that magic quotes can make it more difficult to identify an SQL injection vulnerability. Because magic quotes can hide the obvious problems, such as when an apostrophe in someone’s name causes a database error, they promote a failure to properly escape data. In short, I believe an application is less secure with magic quotes enabled than when they are disabled, and this is independent of all of the other problems presented by the behaviour (see http://www.php.net/manual/security.magicquotes.whynot.php for a brief description about the portability,
performance, and convenience problems associated with magic quotes).
Disabling Magic Quotes
To disable magic quotes, make sure each of the directives is disabled. In php.ini, you can do this with the following:
magic_quotes_gpc = Off
magic_quotes_runtime = Off
magic_quotes_sybase = Off
If your access to PHP configuration directives is limited to .htaccess, you can use the following instead:
php_flag magic_quotes_gpc Off
php_flag magic_quotes_runtime Off
php_flag magic_quotes_sybase Off
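With the directives off, input arrives unmodified, and filtering and escaping are done explicitly and in the proper order. The following is a minimal sketch of that order, using the year-of-birth example from earlier in this column. The function names filterBirthYear and escapeHtml and the lower bound of 1900 are assumptions chosen for illustration; escaping for a database would use that database's own escaping function rather than the HTML escaping shown here.

```php
<?php
// A sketch of "filter input, escape output" with magic quotes disabled.
// The helper names and the 1900 lower bound are illustrative assumptions.

/**
 * Whitelist filter: returns the year as an integer, or null when the
 * input cannot be proven valid.
 */
function filterBirthYear($input, int $currentYear): ?int
{
    // Must be a string of exactly four digits and nothing else.
    if (!is_string($input) || preg_match('/^[0-9]{4}$/D', $input) !== 1) {
        return null;
    }
    $year = (int) $input;
    // Must fall within reasonable limits.
    if ($year < 1900 || $year > $currentYear) {
        return null;
    }
    return $year;
}

/**
 * Escape a value at the point of output, here for an HTML context.
 */
function escapeHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Filter first...
$clean = filterBirthYear('1975', (int) date('Y'));

// ...and escape only when the data is sent somewhere.
echo $clean === null
    ? "Invalid year\n"
    : 'Year of birth: ' . escapeHtml((string) $clean) . "\n";
```

Because the filter sees the raw input, it does not have to account for backslashes that an earlier escaping step might have added.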
In some shared-hosting environments, neither php.ini nor .htaccess changes may be available to you. I’ve been fortunate enough to have never been involved with any projects that can only afford shared hosting (thankfully, even my personal web site is finally being transitioned to a dedicated server). There are many reasons why you might find yourself developing an application that must be capable of being hosted on a shared server: it’s for a small company with a small budget, it’s an application that you want to distribute, or it’s an application for shared hosting providers. There is a good function for dealing with magic quotes at runtime in a NYPHP Phundamental located at http://education.nyphp.org/phundamentals/PH_storingretrieving.php. This function, fix_magic_quotes(), takes into account each of the magic quotes directives and restores all data affected by them. This function needs to be called before any action, such as filtering, is taken on the data. And, of course, this should be considered a last resort.
Until Next Time...
I hope I’ve sufficiently convinced you to never develop with any of the magic quotes directives enabled. They are a horrible practice that no PHP developer should participate in. If you thought they were useful for security for some reason, hopefully you now realize that this is nothing more than a misconception. That’s all for this month’s Security Corner. Until next month, be safe.
About the Author
Chris Shiflett is a frequent contributor to the PHP community and one of the leading security experts in the field. His solutions to security problems are often used as points of reference, and these solutions are showcased in his talks at conferences such as ApacheCon and the O’Reilly Open Source Convention, and his articles in publications such as PHP Magazine and php|architect. Security Corner, his monthly column for php|architect, is the industry’s first and foremost PHP security column. Chris is the author of the HTTP Developer’s Handbook (Sams), a coauthor of the Zend PHP Certification Study Guide (Sams), and is currently writing PHP Security (O’Reilly). As a member of the Zend Education Advisory Board, he is also one of the authors of the Zend Certification. In his spare time, he is leading an effort to create a PHP community site at PHPCommunity.org. You can contact him at [email protected] or visit his Web site at http://shiflett.org/.
EXIT(0);
HELP! I’m a PHP beauty stuck in the body of this Java programmer!
by Marco Tabini
Java developers are acting funny these days. They approach you on the street, sprinting out of a dark alley and whispering in your ear words like “Psst! Hey! I’m a Java developer. Yeah, you know, Java: that language developed to run fridges that somehow managed to make its way into the corporate IT world.”
“Say, I’ve heard about this script kiddie language called ‘PHP.’ It’s really not a professional programming platform like Java, but I’m finding it more and more difficult to convince my boss that he should invest in expensive Jsomething infrastructure instead of betting his money on PHP. Do you think you can give me some tips I can use against it? Can you spare some change?”

Jokes aside (I couldn’t resist), IBM’s announcement about the Zend Core platform (see this month’s What’s New) puts the final nail in the coffin of any doubts about whether PHP is ready for the enterprise. The naysayers still… naysay that this is nothing more than a ploy on Zend’s behalf to make its own what it hasn’t created. To them, I simply offer the immortal words of William Shatner: “Get a life!”

Others think that IBM’s announcement won’t make any difference as far as the PHP market goes. Perhaps, but it depends on what you consider the PHP market to be. On the surface, from a programmer’s perspective, the Zend/IBM announcement brings nothing new to the table. PHP was stable, is stable and will probably remain stable. The community needs neither Zend nor IBM to tell them that.

From a business perspective, however, things change quite a bit. IBM is a heavyweight, and my experience with them is that they do not tend to do things just for the fun of it. As they did with Linux (whose adoption by IBM sparked the beginning of its acceptance in the corporate world), they must have perceived significant value in using PHP, as opposed to Java, as the driving force behind large-scale systems. While I doubt that IBM’s consulting services arm will start going around offering PHP as a first-tier solution for their big clients, they’ll likely take advantage of its inexpensiveness when the bottom line calls for a more financially conservative development strategy.

This matters because, as anyone who has been observing corporate IT culture knows, the focus today is on building systems that are reliable, safe, inexpensive and, especially, primed for change. Faced with the choice between Java and PHP, both endorsed by what is possibly one of the most respected IT companies in the world, IT managers will, at the very least, be able to consider both technologies on a slightly more level playing field.

I have seen plenty of people compare the Zend Core announcement to the one Sun and Zend made a while back about Java integration. Personally, I think that the two could not be farther apart. In the Sun announcement, PHP simply played second fiddle to Java. The most creative among us might have interpreted it as the first sign that Java’s façade was beginning to crumble, but in all fairness there wasn’t much more to read in there than the fact that Java stayed on top of the food chain and PHP was allowed a bit part in the play. After all, what else should we have expected? Sun has a more-than-vested interest in ensuring that Java remains the king of the jungle, and nothing could be better than reducing PHP to the role of Cheeta, the colourful but harmless chimp.

The Zend Core announcement, on the other hand, puts the spotlight firmly on PHP. Whether this will make any difference or not, I can’t say, but I’m happy that it has happened, and look forward to seeing its effects.