Introducing the php|architect Grant Program

As PHP’s importance grows on the IT scene, something that is happening every day, it’s clear that its true capabilities go well beyond what it’s being used for today. The PHP platform itself has a lot of potential as a general-purpose language, and not just a scripting tool; just its basic extensions, even discounting repositories like PEAR and PECL, provide a high-quality array of functionality that most of its commercial competitors can’t offer without expensive external components.

At php|a, we’ve always felt that our mission is not limited to trying our best to provide the PHP community with a publication of the highest possible quality. We think that our role is also that of reinvesting in the community that we serve in a way that leads to tangible results. To that end, this month we’re launching the php|architect Grant Program, a new initiative that will see us award two $1,000 (US) grants to PHP-related projects at the end of June.

Participating in the program is easy. We invite all the leaders of PHP projects to register with our website at http://www.phparch.com/grant and submit their applications for a grant. Our goal is to provide a financial incentive to those projects that, in our opinion, have the opportunity to revolutionize PHP and its position in the IT world. In order to be eligible for the Grant Program, a project must be strictly related to PHP, but not necessarily written in PHP. For example, a new PHP extension written in C, or a new program in any language that lends itself to using PHP in new and interesting ways would also be acceptable. The only other important restriction is that the project must be released under either the LGPL, the GPL or the PHP/Zend license. Thus, commercial products are not eligible.

Submit Your Project Today! Visit http://www.phparch.com/grant for more information.
TABLE OF CONTENTS
php|architect Departments
Features
INDEX
EDITORIAL RANTS
NEW STUFF
Shell Scripting with PHP (PHP CLI) by Jayesh Jain
FreeTrade, e-commerce for developers by Vladan Zirojevic
REVIEW TUTOS
REVIEW
Blazing Site Performance Using Objects and Sessions by Peter Moulding
phpLens
TIPS & TRICKS
Writing an RSS Aggregator With PHP by Marco Tabini
Exploring XSLT Processing Options Within PHP
by John Holmes
BOOK REVIEWS PHP Professional Projects Professional PHP Web Services
by Stuart Herbert
exit(0); Let Me Throw The First Stone
The Story of the Bouncing Ball and XML by Sam Smith
March 2003 · PHP Architect · www.phparch.com
EDITORIAL RANTS
Awwww Yeah....
These were the first words I uttered after accepting an invitation to take the reins as editor-in-chief for php|architect magazine. About 3 weeks later, Marco, our fearless publisher, reminded me that I still had to write the ‘Editor’s Rants’ article. This is essentially what you see at the beginning of seemingly every magazine ever published. ‘Letter from the Editor’, ‘From the Editor’s Desk’. Whatever you want to call it – it’s all the same – and as far as I can tell, it’s all pretty meaningless.

Over the past few weeks I’ve read no fewer than 30 of these articles, trying to get a clue as to this article’s purpose in life. I read magazines covering every conceivable topic, formal and informal, from the politically perfect to the underground, grass-roots publications and found that these editorial columns all seem to accomplish a single goal: to make the editor look like the most pretentious, micro-managing, self-important species ever to roam the earth. Did these people lack attention as children or something? The more I read, the more I came under the suspicion that the species I had been recruited to emulate might be... well... not my cup of tea. I also couldn’t help but ask myself out loud on several occasions, ‘who reads these things?’

But alas, this is not meant to be an editorial on editorials (or a ‘metatorial’ if you will). If I’m forced to partake in this charade, then I will endeavor to make it something useful. It so happens that, being my first month at the helm, I have plenty to share with whoever is reading this, specifically with regard to my vision for php|a over the coming months. I do welcome comments on all of this, by the way: [email protected]

I’ll break my thoughts down into sections covering our current status, the near future, and the longer haul.

Where We Are
From rather humble beginnings, I think php|a has accomplished much with regard to the vision of its creator, Marco Tabini. This vision was twofold, so I’ll cover the two parts separately.

First, php|architect has become known as a reputable resource for well-written, well-edited documentation covering all aspects of PHP development. While I will not bore you with the laborious details of how challenging it can be to combine editorial precision with technical savvy, I will say that this process and this accomplishment both fall under the heading of ‘non-trivial’ tasks. Certainly a very large thank you is in order for each of the many authors who collaborate with our editorial staff each month. Working together, we have been able to overcome challenges ranging from differences in opinion to differences in time zone, culture, and even language. Their time, patience, and hard work are very much appreciated.

Second, php|architect continues to make great strides in establishing itself as a voice for the PHP developer community, as well as an advocate for the deployment, evolution, and progress of the PHP platform. We haven’t sought to glorify the state of PHP, nor to shun its more proprietary peers. Instead, we try to maintain a realistic focus on the use of PHP in production environments. We also take a decidedly optimistic view of the future of PHP, which at times finds itself at odds with what some of the larger names in the community might have in mind. Nevertheless, we feel that the opinions of our authors, the editors, and the community are an important part in shaping any project which lists ‘open development’ as a goal.

Editorial Team: Arbi Arzoumani, Brian Jones, Peter James, Marco Tabini
Graphics & Layout: Arbi Arzoumani
Administration: Emanuela Corso
Authors: Stuart Herbert, Jayesh Jain, Peter Moulding, Dave Palmer, Sam Smith, Marco Tabini, Vladan Zirojevic

php|architect (ISSN 1705-1142) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 3342, Markham, ON L3R 6G6, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibility with regard to the use of the information contained herein or in any associated material.

The Months Ahead
The months ahead will continue to build upon the foundation laid in the first months of our existence. The main goals in the immediate future focus largely on quality and efficiency – in other words, bringing you even better quality content without breaking our necks or collapsing from exhaustion in the process. To that end, there have already been improvements, tweaks and hacks put in place.

First, we’ve convinced a one-time volunteer editor, Peter James, to come on staff as a full-fledged editor. Peter has proven himself to be a tireless worker and a great collaborator. He has already done wonderful work here, and he will undoubtedly leave an indelible mark upon the pages of php|architect over the coming months.

Second, as further proof of our commitment to the community at large, we are very proud to announce this month the launch of a grant program to help fledgling (or not-so-fledgling) projects continue to progress and bring PHP into the places where no man has gone before – or at least the road less traveled. See this month’s pages and the php|architect website for more details.

The Longer Term
Looking out and attempting to predict our future at such an early stage, while fun, is probably also an exercise in futility. All that can be said for sure is that there are plans of great size and great number. More huge ideas are born every day. Surprisingly, only some of the ideas belong to us ‘staffers’. The rest come from you! So far, most of the changes and adjustments we’ve made, and some which we’re working on, have come directly from our readers. We’re all constantly monitoring emails, looking in the php|architect forums, and even monitoring the PHP mailing lists in search of yet another unfulfilled need that we might be able to lend a hand to. Please keep these requests coming! There is hardly a greater compliment that can be paid to a publication than constructive feedback.

In Conclusion
So now that I’ve firmly planted the image in your head of an SUV-driving, tie-wearing, clean-shaven, pretty boy editor sitting on his overstuffed couch with a $4.00 cup of coffee and a laptop, let me assure you that while I take the publication and my work seriously, I don’t take myself all that seriously at all. The reality is that I have hardly a clue how I got here. I can only be thankful for Marco’s obvious and complete insanity in choosing me for this post. I am grateful to Marco for this opportunity, to those authors with whom I’ve had (and continue to have) the distinct honor of collaborating, and to the many readers who have offered their feedback and encouragement. Until next month...
NEW STUFF
PHP 4.3.1 Released
The PHP Group has announced the release of version 4.3.1 of the PHP interpreter. The new release addresses a bug in the CGI version of the interpreter that renders the --enable-force-cgi-redirect compile-time switch ineffective. This, in turn, makes PHP-CGI susceptible to outside attacks that could result in the execution of arbitrary PHP code. For more information, visit http://ca.php.net/release_4_3_1.php

ADODB 3.20 Available
A new version of ADODB, the popular and efficient database abstraction library, has been released by its maintainer, PHPEverywhere blog author John Lim. ADODB 3.20 supports several new features, including abstracted capabilities for creating tables and indexes, although this functionality is still considered to be in its alpha stage. According to its website, available at http://php.weblogs.com/adodb, ADODB is twice as fast as PEAR-DB and 10% faster than PHP-Lib.

Nova: A P2P Client In PHP
The Nova Project has released Nova, a peer-to-peer application compatible with the popular GNUtella file-sharing network. Nova is written entirely in PHP using the PHP-GTK extension, and provides an excellent example of how PHP can be used to develop applications outside the Web space. Nova, which is based on the GnucDNA library, currently supports only basic functionality and is only compatible with Windows. You can find more information at the project’s homepage (https://sourceforge.net/projects/novap2p/).

Zend Releases Studio 2.6
PHP powerhouse Zend Technologies has released version 2.6 of its Zend Studio IDE. The new version includes several bug fixes, as well as new features, such as CVS integration, advanced project management capabilities and improved performance on all platforms. Zend Studio 2.6 is priced starting at $195 (US). More information is available from the Zend website at http://www.zend.com.

Introducing the php|architect Affiliate Program
Earlier this month, we proudly announced the introduction of our new affiliate program, which pays a commission for each purchase made through our website by visitors referred from one of our partners. Participation in the program is free, and open to all websites, without any minimum requirements. You can find more information on the php|a website at https://www.phparch.com/afflogin.php.

PHP Conference in Montréal
PHP Québec will hold its first PHP conference in the city of Montréal on March 20 and 21. Speakers at the event include a who’s who of the PHP community, including Zeev Suraski, Andrei Zmievski and Rasmus Lerdorf. php|architect will also be there, and our own Marco Tabini will give a presentation on PHP-based business and our experience in the world of electronic publishing. For more information, you can visit the conference’s web site at http://phpconf.phpquebec.com/.
REVIEWS
Reviewed For You
TUTOS
REVIEW
The Ultimate Team Organization Software By Marco Tabini
A few days before writing this review (which, at the time, was another review), I had something of an epiphany. I was on the phone with a client, boasting about how organized I am, when Arbi walked by my office and laughed a typical “I know better” at me. My ego was bruised, so I wrote down a list of ways that I am organized on a piece of paper. I’d gladly show you that piece of paper if I hadn’t, er, temporarily misplaced it.

As the saying goes, if you can’t blame anyone but yourself, blame the process. After all, a person can only do so much on his own. I set out to find a decent groupware application that would allow me to get more organized and, at the same time, better manage the office’s internal processes (formatting Arbi’s hard drive would have been a bonus, but I couldn’t find any applications capable of the perfect union between a calendar and that).

After a bit of searching, I came across TUTOS, which stands for The Ultimate Team Organization Software. Created by German developer Gero Kohnert as an internal application for his previous employer, TUTOS has evolved into a complex groupware application that is used by the likes of Siemens. Best of all, it’s written entirely in PHP.
The Cost: Free (released under the GPL)
Requirements:
- Apache
- One of these databases: PostgreSQL, MySQL (MySQL-3.23.21-1.i386.rpm), Oracle, or Borland Interbase 5
- PHP 4 (minimum PHP 4.1)

Developer Background: “I started a first TUTOS-like system for my former company back in 1997 or so. They still use it a lot to have an overview of all their customers, software installations and different products and to support the internal Quality Management (ISO9600). After leaving this company I started TUTOS, an enhanced system based on the same thoughts with a lot more features. On TUTOS I'm working now for more than a year and after making a first installation in my department at my current employer I think it is time to release it to the public, giving something back to the Open Source Community.”
Groupware, Anyone?
TUTOS is a complete groupware application in every sense. It includes a calendar, a contact management system, a bug tracking system, a product/project repository, mail capabilities, a time tracking system, an invoicing system, and much more. The system is also multilingual, supporting about fifteen languages right out of the proverbial box. For companies whose groups work in different parts of the world, TUTOS includes the ability to specify timezone information for each user profile, so that everything remains properly synchronized and meaningful to all users. Finally, the system includes a “watchlist” mechanism that makes it possible to remain up-to-date, via e-mail, on changes to systems such as the bug tracking database or the calendar schedule. Even for a small workgroup, this “active notification” approach is a very important feature, particularly when the members of a team do not all work in the same office.

User Interface and Security Features
The system implements a very fine-grained permission system, thus ensuring that the administrators have every way to control access to each function. In addition, a TUTOS administrator has prompt access to all of the security settings, including his/her own. TUTOS features a colourful and easy-to-use interface that is sure to please both lovers and haters of GUIs. As you can see in Figure 1, only a minimal amount of information is kept on the left-side frame menu, leaving as much real estate as possible available for the system’s actual functionality.

Figure 1
Figure 2
Documentation, Interoperability and Limitations
It’s interesting to note that the ultimate goal of the TUTOS project is to create a portable groupware system whose interface can be written for a number of different platforms. As such, the development team has put a large amount of work into the design of the underlying data structures themselves. Naturally, that’s good news because, in addition to the PHP interface, other interfaces are likely to be developed. In fact, a KDE/Gnome version is currently in the works. From a portability perspective, TUTOS’ PHP interface doesn’t use any proprietary components that are tied to a particular architecture, and it supports several different database systems, including MySQL, PostgreSQL and Oracle.

In my opinion, the system could use a bit more work as far as interoperability is concerned. In particular, I think that attention should be paid to integration with outside services like LDAP and/or Microsoft Exchange. This could help make TUTOS a more appealing solution for larger organizations. Since the entire system is based on PHP, however, it should be relatively simple to expand its functionality to include whatever elements one needs. The code is quite clean and well documented, so it forms a good foundation on which to build a more complex and specialized application.

TUTOS includes a good deal of documentation. For the developer, the TUTOS website provides a detailed analysis of all the data structures and logical organization of the system, as well as an API reference and an installation guide. As for the end user, they get a complete contextual help system, shown in Figure 2, that includes an accurate description of what each individual screen does.

Conclusion
Now that I have TUTOS running on my computer, I can say with confidence that I feel more organized. As with all groupware systems, the trick to getting TUTOS into an organization is to stimulate user acceptance. After all, your average user is normally not keen on trying new things, and prefers instead to stick with the application that he or she has learned to use well. TUTOS represents an excellent groupware solution that is easy to learn and offers a wide array of functionality. It is organized in a logical fashion, provides lots of documentation, and its PHP codebase makes it easily extensible.
FEATURES
Shell Scripting with PHP (PHP CLI) By Jayesh Jain
PHP 4.3.0's new CLI SAPI improves greatly on the foundation laid by its CGI ancestor. Take it for a spin – no web server required!
Introduction
With the introduction of version 4.2, PHP started supporting a new SAPI (Server Application Programming Interface) called CLI (Command Line Interface). This facility was introduced to help developers create small shell applications with PHP that have no dependencies on a web server. In version 4.2.0, the CLI SAPI was experimental, and had to be explicitly enabled using the --enable-cli option when running ./configure. However, with PHP 4.3.0, the CLI SAPI has been deemed stable and is therefore enabled by default. It can be explicitly disabled by specifying the --disable-cli option to ./configure.

Though it was technically always possible to create independent shell-style scripts with PHP using a standalone CGI interpreter, there are a few things that make the new CLI SAPI an especially attractive and unique tool:

- Unlike the CGI SAPI, no headers are written to the output. Though the CGI SAPI provides a way to suppress HTTP headers, there’s no equivalent switch to enable them in the CLI SAPI.
- The CLI starts up in quiet mode by default, though the -q switch is kept for compatibility so that you can use older CGI scripts.
- The CLI does not change the working directory to that of the script (the -C switch is kept for compatibility).
- Three words: Plain text errors! (no HTML formatting).

In this article, we’ll discuss how to use PHP’s CLI feature to exploit the power of PHP from the command line. We’ll assume that you have a fair understanding of PHP and that PHP is installed and working properly on your computer. Most of the examples in this tutorial were tested on a Windows platform (w2k, to be precise), but should work unaltered on a Linux box.

REQUIREMENTS
PHP Version: 4.3 and above
O/S: Any
Additional Software: N/A
What is a PHP Shell Script?
The term ‘PHP shell script’ can cause some confusion for those who are used to traditional UNIX-style shell scripts. The reason for the confusion is that PHP, strictly speaking, does not provide a traditional interactive environment for command execution. Another source of confusion is the fact that the name of the SAPI, ‘CLI’, is also an acronym used to refer to a UNIX shell, which by definition is an interactive command execution environment! Let’s clarify this right now by simply stating that the PHP CLI is an interface to a PHP interpreter that can be accessed via the command line, not an interactive shell.

In reality, ‘PHP shell scripts’ are just PHP scripts that can be run from a command line or a cron job (which will be discussed later in the article) in the same fashion as a shell or Perl script. A normal shell script consists of two main parts: the very first line, which indicates the interpreter that should be parsing the script, and the code to be executed by the interpreter. Normally, the interpreter that is specified is a UNIX shell, or some other interpreter, like Perl. However, since the creation of a PHP interpreter which can be executed directly from a shell (without the help of a web server), we can specify PHP as the interpreter for our script, and then fill the file with PHP code. For all you Windows users, PHP shell scripts are just like batch (.bat) files in MS-DOS, but with the power of the PHP programming language.

Getting Started
Let’s start with a small script (the most familiar one) which simply prints the words “Hello World” to the screen. Using your favorite text editor, create a text file called “world.php” in your PHP folder and enter the following text.
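The listing for world.php is not present in this text. Judging from the one-line ‘-r’ version the article shows for the same greeting, a minimal sketch might look like the following; the $message variable is our own naming and an assumption, not necessarily what the author wrote.

```php
<?php
// world.php: print a greeting followed by a newline.
// No HTML is emitted, because the output goes to a terminal.
$message = "Hello World";
echo $message, "\n";
```

The closing ?> tag is left out on purpose; it is optional at the end of a file, and omitting it avoids stray trailing whitespace in the output.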
Save the file, ensure that you have the PHP CLI executable in your path, and run the following command:

> php world.php

Surprised to see the output, ‘Hello World’, in the command prompt instead of the web browser? Welcome to the other dimension of PHP!

Note: If you are using PHP version 4.2 (or the CGI version) you may have noticed that the following headers are also in the output:

X-Powered-By: PHP/4.2.3
Content-type: text/html

The PHP CGI version does that by default, which can serve as an indicator that you’re using the CGI (in addition to cluing you in on the PHP version you’re running). To suppress the HTTP headers and follow along with the remaining examples, run the PHP CGI with the ‘-q’ command line flag:

> php -q world.php

Using the ‘-r’ option, you can also pass the PHP code to be executed to the PHP CLI directly from the command line. For example:

> php -r 'echo "Hello World\n";'

This will output ‘Hello World’, the same message as our earlier example. This is fine for small bits of code you only want to execute once. Only as the code gets larger will you need the power of a script.

You can redirect the output from any script to a file by running:

> php world.php > outputfile

Linux and UNIX users can also redirect the script output to another command by using the pipe operator (|). For example:

> php world.php | wc -w

This will return the word count of the output (in this case, it will output the number ‘2’). Conversely, you can also send output from some application or command to the PHP interpreter by using the pipe operator, for example:

> someapp | php

This might be useful if ‘someapp’ outputs PHP code which can be immediately executed by the interpreter, instead of adding the overhead of first writing to a file and then calling the interpreter separately, which would then also be subject to the added overhead of opening and reading a file from disk.

Finding Your PHP CLI
Output on my Windows machine of the php.exe in the ‘cli’ folder was:

PHP 4.3.0 (cli) (built: Dec 27 2002 05:34:00)
Copyright (c) 1997-2002 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies

Output on my machine of php.exe in the php root folder was:

PHP 4.3.0 (cgi-fcgi), Copyright (c) 1997-2002 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies

This would indicate that the ‘php.exe’ file on my machine in the php root folder is actually the old CGI. For the reasons mentioned earlier comparing the CGI and CLI interpreters, and because the title of this article says ‘CLI’ in it and not ‘CGI’, I’ll be refraining from using the CGI in the examples going forward. I just wanted to point out a possible point of confusion here... so on we go!

Getting Interactive
PHP has always made it very easy to interact and trade data with a user via the traditional browser interface. Let’s have a look at how PHP’s ‘stream’ functions and associated constants can extend this ease to the command line interface. There are three streams available in PHP CLI. If you’re familiar with the corresponding standard UNIX device names, you’ll be right at home here, as these streams emulate the same functionality.

- stdin (‘php://stdin’)
- stdout (‘php://stdout’)
- stderr (‘php://stderr’)

The following example will display “Hello World” in the output window using the output stream.
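The listing for the output stream is missing here as well. As a rough sketch, assuming a small helper of our own called write_line() (a hypothetical name, not from the article), writing to ‘php://stdout’ could look like this:

```php
<?php
// Write one line of text to an already opened stream.
// Returns the number of bytes written, as reported by fwrite().
function write_line($stream, $text)
{
    return fwrite($stream, $text . "\n");
}

// Open the standard output stream through its php:// wrapper name.
$stdout = fopen('php://stdout', 'w');
write_line($stdout, 'Hello World');
fclose($stdout);
```

Taking the stream as a parameter keeps the helper independent of where the text ends up, so the same function can write to a file handle or to the error stream.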
Why Do You Need A PHP Shell Script?
- Shell scripts can take input from, and send output to a user via STDIN/STDOUT, a regular text file, or even another command. In other words, they’re very flexible.
- Shell scripts are a ‘quick and dirty’ way to create your own command-line tools and applications.
- If you already use PHP for web development, why learn another language to write shell scripts?
- Shell scripts are commonly used to automate day-to-day administration tasks.

Figure 1

This time, we’ll demonstrate how to use an input stream. The script accepts input from the user, waits for the user to press the Enter key, and then displays the entered text.

The following example shows you how to output text to an error stream.
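Neither the input-stream nor the error-stream listing appears in this text, so the sketch below covers both cases under stated assumptions: read_line() and report_error() are hypothetical helper names of ours, and the streams are passed in as parameters so that nothing blocks waiting for the keyboard when the file is merely loaded.

```php
<?php
// Print a prompt on $out, then block until a full line arrives on $in
// (for a terminal, that means until the user presses Enter).
// Returns the typed text without the trailing line break.
function read_line($in, $out, $prompt)
{
    fwrite($out, $prompt);
    $line = fgets($in);
    return $line === false ? '' : rtrim($line, "\r\n");
}

// Send a message to the error stream, so that it stays visible even
// when the normal output is redirected to a file or another command.
// Returns the number of bytes written.
function report_error($err, $message)
{
    return fwrite($err, 'Error: ' . $message . "\n");
}
```

For interactive use, the handles would come from fopen('php://stdin', 'r'), fopen('php://stdout', 'w') and fopen('php://stderr', 'w'); echoing back the return value of read_line() gives the behaviour the text describes.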
To make accessing and moving data around in a shell environment simpler, PHP has added the following constants. Again, for UNIX users, these names should look familiar. In UNIX, these typically map to the keyboard (STDIN) and the screen (STDOUT and STDERR) by default.

Constant Details

STDIN: An already opened stream to stdin. You don’t have to open it with:
$stdin = fopen('php://stdin', 'r');

STDOUT: An already opened stream to stdout. You don’t have to open it with:
$stdout = fopen('php://stdout', 'w');

STDERR: An already opened stream to stderr. You don’t have to open it with:
$stderr = fopen('php://stderr', 'w');

A script can pass these constants straight to the stream functions in place of a handle returned by fopen(). For users of version 4.2, you’ll need to add these additional lines to the top of your script in order for code that uses the constants to work:

define('STDIN',fopen("php://stdin","r"));
define('STDOUT',fopen("php://stdout","w"));
define('STDERR',fopen("php://stderr","w"));

Using Arguments in Scripts
Anyone who has experience writing shell or Perl scripts would probably consider the PHP CLI completely useless if scripts weren’t able to handle arguments passed to them from the command line. The CLI handles this with the use of the $argv and $argc variables. All of the arguments passed to your script are stored in a zero-based global array variable, $argv. Another global variable, $argc, holds the number of arguments passed to the script (everyone coming from a C/C++ background should be familiar with this practice). Here is the code, which displays the total number of arguments passed as well as the arguments:
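The argument listing is missing from this text. A minimal sketch of what it describes, assuming a helper of our own named print_arguments() (hypothetical) and reading the arguments from $_SERVER so that it does not depend on register_globals:

```php
<?php
// Write the argument count, then each argument with its index, to $out.
function print_arguments($out, $args)
{
    fwrite($out, 'Number of arguments: ' . count($args) . "\n");
    foreach ($args as $index => $value) {
        fwrite($out, 'Argument ' . $index . ': ' . $value . "\n");
    }
}

// $_SERVER['argv'] is available even when register_globals is off.
// Element 0 is the script name itself, so it is included in the count.
$arguments = isset($_SERVER['argv']) ? $_SERVER['argv'] : array();
$out = fopen('php://stdout', 'w');
print_arguments($out, $arguments);
fclose($out);
```

Under the CLI SAPI of PHP 4.3 and later, the predefined STDOUT constant can be passed as the first parameter instead of opening ‘php://stdout’ by hand.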
Figure 2
Assuming this is stored in a file named argument.php, you could test this script by running:

> php argument.php arg1 arg2

Refer to Figure 2 for sample output.

Important: You can use $argc and $argv in the manner described above only if you have both the register_globals and register_argc_argv settings in php.ini set to ‘on’. As of PHP version 4.3, register_globals is set to ‘off’ by default. You should do your best to write your scripts so that they do not require register_globals to be on, to avoid possible security issues in your incoming form variables. You can still use $argv and $argc if register_globals is set to ‘off’. Just use the following code:

$argc = $_SERVER['argc'];
$argv = $_SERVER['argv'];
Using External Commands in Scripts
In order to run any external command from a script, we can use the PHP function shell_exec. shell_exec will execute a command via the shell and return the complete output as a string.
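No listing accompanies this description here. As a hedged sketch, the wrapper below (run_command() is our own hypothetical name) calls shell_exec() and splits the returned string into lines; it assumes that an ‘echo’ command is available in the shell, which holds on both Linux and Windows.

```php
<?php
// Run a command through the shell and return its output as an array of lines.
// shell_exec() hands back the whole output as one string; when the command
// produces nothing or cannot be run, it returns null (or false).
function run_command($command)
{
    $output = shell_exec($command);
    if ($output === null || $output === false) {
        return array();
    }
    // Accept both UNIX (\n) and Windows (\r\n) line endings.
    return preg_split('/\r?\n/', rtrim($output, "\r\n"));
}

foreach (run_command('echo Hello from the shell') as $line) {
    echo $line, "\n";
}
```

Because the command string is handed to the shell as-is, anything built from user input should be passed through escapeshellarg() first.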
For Linux/UNIX Users
As we’ve discussed, the PHP executable runs independently of the web server. An added benefit of using this interface for your scripting is that it makes your scripts more portable than using, say, ‘/bin/bash’. Using ‘#!/usr/bin/php’ at the top of your script will allow you to invoke the PHP interpreter on a Linux machine, and on Windows, this line will be skipped over to get to the PHP code below. Now you can write in your familiar Linux environment, and still deploy to Windows.

Tip: Don’t forget that Linux won’t let you run the script directly until it has its ‘execute’ bit set. Assuming your script is called ‘testscript.php’, you can just ‘cd’ to the directory where the script lives and type:

> chmod +x testscript.php

Once you have set the executable attribute, you can execute the file like this:

> ./testscript.php

Also, don’t forget to add the PHP tags in the script file, otherwise PHP will not interpret it properly. If you’re stuck with the CGI version and want to suppress the HTTP headers, use #!/usr/bin/php -q. Similarly, you can also use any other PHP arguments.

Scheduling PHP Scripts
PHP scripts can be scheduled to run automatically with the help of ‘cron’. Cron is the name of the standard execution scheduling mechanism that enables UNIX users to execute commands or scripts automatically at some regular interval. A common use for it is to back up all of the files on the server, do some database cleanup, or send emails to administrators about the state of the system. Try typing ‘man cron’ or ‘man crontab’ on your Linux system to learn more about how to get your PHP scripts to run at regular intervals.

This can be an extremely valuable tool to developers who need to do major data crunching without the user in the browser being adversely affected. For example, while it’s possible to have a user click a button which triggers the collection of live network data coming from remote LDAP, SNMP or other sources, grabbing and manipulating this data to make it suitable for presentation can really take some time. Why not poll these sources in a cron job at regular intervals, store everything to a database in a more friendly format, and have the front end simply grab from the database in a more ‘bite size’ format? The possibilities are endless!

For Windows Users: Configuring Windows to Execute PHP Scripts
To run PHP scripts on your Windows machine, you’ll need to associate the PHP files with the PHP interpreter. To do this, open Windows Explorer, click on the ‘Tools’ menu and select ‘Folder Options’. Click on the ‘File Types’ tab and select the ‘New’ button. Type ‘.php’ as the file extension, and click OK.
Refer to Figure 3
You can also register files with extension ‘.php3’ or ‘.php4’ in the same fashion mentioned above.
Select the PHP entry in the ‘Registered File Types’ list box, click the ‘Advanced’ button, click ‘new’ and type ‘Run’ in the Action box. In the ‘Application Used to Perform Actions’ box, type C:\PHP\PHP.exe “%1” %* (change the PHP path if it’s different on your machine. Note that ‘%*’ is used to send any command line arguments). Click ‘OK’, ‘OK’ again and then the ‘Close’ button. Your Windows machine is now configured to run PHP Scripts. Just double click on any PHP file in Windows Explorer to run it.
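To check that command line arguments actually reach a script, a small test script can print what it received. This is only a sketch: the function name describeArguments() is made up for illustration, while $argv is the argument array PHP itself provides on the command line.

```php
<?php
/**
 * Build a plain-text report of the arguments passed to the script.
 * $args has the same shape as $argv: element 0 is the script name.
 */
function describeArguments(array $args): string
{
    $script = array_shift($args);          // the first element is the script itself
    $out  = "Script: $script\n";
    $out .= "Arguments: " . count($args) . "\n";
    foreach ($args as $i => $arg) {
        $out .= "  [" . ($i + 1) . "] $arg\n";
    }
    return $out;
}

// $argv is only populated when the script runs from the command line.
if (isset($argv) && is_array($argv)) {
    echo describeArguments($argv);
}
```

Saved under a hypothetical name such as args.php and started with two arguments, the script should list both of them after the script name.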
More Information

Certain php.ini directives are overridden by the CLI SAPI because their usual values do not make sense in a shell environment:

a) html_errors = FALSE
As the output of the PHP CLI does not go to a browser, there is no need to emit HTML tags, so this directive defaults to FALSE.
Refer to Figure 4
Figure 3
b) implicit_flush = TRUE
All output from output commands needs to be shown immediately rather than buffered. As a result, this directive defaults to TRUE.

c) max_execution_time = 0 (unlimited)
Because PHP runs in the shell environment and often performs tasks that take a long time, max_execution_time defaults to unlimited.

d) register_argc_argv = TRUE
This setting is TRUE so that you always have access to the number of arguments passed to the script ($argc) and the array of the actual arguments ($argv) in the CLI SAPI.

Extensions

Now that you know all about the PHP CLI, use the following code examples, or make up some of your own, to test the power of PHP and the CLI.

Using the MySQL Database in Your Scripts

You can connect to a MySQL database exactly the same way as you do in your normal PHP scripts, except that you should not output HTML tags, as the output is not sent to a browser. Here is a simple example, which connects to a MySQL database and lists the username field for all records in the user table.
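A minimal sketch of such a script might look like the following. It is an illustration under stated assumptions rather than the article's own listing: it uses the mysqli extension instead of the older mysql_* functions, the host, user, password and database name are placeholders, and the helper names are invented. The connection is attempted only when the script is started with a hypothetical --run argument, so the formatting code can be tried without a database server.

```php
<?php
// Sketch only: the connection details below are hypothetical placeholders.

/**
 * Format one username per line as plain text (no HTML), suitable for a terminal.
 */
function formatUsernames(array $usernames): string
{
    $out = "";
    foreach ($usernames as $name) {
        $out .= $name . "\n";
    }
    return $out;
}

/**
 * Fetch the username field of every record in the user table.
 */
function fetchUsernames(mysqli $link): array
{
    $names = [];
    $result = $link->query("SELECT username FROM user");
    if ($result === false) {
        return $names;                       // query failed; nothing to list
    }
    while ($row = $result->fetch_assoc()) {
        $names[] = $row['username'];
    }
    $result->free();
    return $names;
}

// Connect only when explicitly requested on the command line.
if (($argv[1] ?? '') === '--run') {
    mysqli_report(MYSQLI_REPORT_OFF);        // report failures through return values
    $link = new mysqli('localhost', 'dbuser', 'dbpass', 'mydb');
    if ($link->connect_errno) {
        fwrite(STDERR, "Connection failed: " . $link->connect_error . "\n");
        exit(1);
    }
    echo formatUsernames(fetchUsernames($link));
    $link->close();
}
```

Note that the output uses "\n" line endings rather than HTML line breaks, which is the only real difference from a browser-oriented script.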
Sending Email in Scripts

The syntax used to send email is shown below. You can skip the headers, as they are an optional parameter.

<?php
// The addresses and text in this listing are placeholders.
$useremail = "user@example.com";
$subject = "System status";
$message = "Message body goes here.";
$headers = "From: Sender <sender@example.com>\r\n";
$headers .= "To: Administrator <admin@example.com>\r\n";
$headers .= "Reply-To: <sender@example.com>\r\n";
$headers .= "X-Priority: 1\r\n";
$headers .= "X-MSMail-Priority: High\r\n";
$headers .= "X-Mailer: PHP on someserver.com";
mail($useremail, $subject, $message, $headers);
?>

Conclusion

In this article we have tried to show you the power of the PHP CLI interface. I am sure you are now convinced of PHP’s flexibility, and urge you to expand upon the concepts discussed here. Certainly, this is an area that will likely receive much attention as PHP evolves into a more complete and mature language. Enjoy!
About The Author
Jayesh Jain is working as a consultant in Auckland, New Zealand. He has several years of n-Tier development experience and is currently working with PHP to develop interactive client solutions. He has a passion for Web development, and in his spare time he likes to write articles. Contact him at: [email protected]
Click HERE To Discuss This Article https://www.phparch.com/discuss/viewforum.php?f=6
Figure 4
FreeTrade, e-commerce for developers
By Vladan Zirojevic

A package aimed at the developer, FreeTrade offers a cost-effective yet robust framework for creating your next online superstore. This article will show the power and flexibility of FreeTrade, and follow up with a short Q&A with FreeTrade creator and author of 'Core PHP Programming', Leon Atkinson.
E-commerce applications are one of the most common requests web developers receive today. An e-commerce solution should provide two different types of functionality. Site visitors should see a fully-featured online shop, while business owners should have an easy-to-use back end administration system. There are many open source e-commerce solutions with many good features, and I have tried almost all of them. Specific to PHP, there are a few cool scripts which will help you accomplish everything needed to set up an online store and to maintain it easily. Applications like osCommerce (http://www.oscommerce.com), FishCart (http://www.fishcart.org), and phpShop (http://www.phpshop.org) are just a few examples.

How can you select a package to use as your e-commerce ‘solution of choice’? It’s not easy. You must ask yourself what you are looking for and identify the problems which need solving. In my case I was looking for a solution which met the following requirements:
•open source
•well written
•fully-featured
•not in an early beta version
•supported by the community
•very configurable

FreeTrade on the Web
The Cost: Free (released under a BSD-style license)
Home Page: http://share.whichever.com/index.php?SCREEN=freetrade
Mailing list archive: http://share.whichever.com/pipermail/freetrade-dev/
Demo store:

REQUIREMENTS
PHP Version: PHP 4.1.x, MySQL
O/S: Any
Additional Software: N/A
These requirements are not too strict, but some of them do exclude a number of the solutions out there. After searching, configuring, testing, customizing, and a good many sleepless nights, I believe that I have found a good, although not well-known, solution for a developer. FreeTrade is a fully-featured and highly configurable PHP script for creating and maintaining e-commerce Web applications. FreeTrade is not a very well-known script, and there are a few reasons for this. For example, it does not have its own domain name, the demo store is very poorly designed, and FreeTrade uses a different (and, at first glance, strange) development model called FreeEnergy.

FreeEnergy

FreeEnergy is not a new concept in PHP development, but it is also not a widely used one. I am sure that you use the include() function in your PHP scripts frequently. Some of the more common uses are for headers, footers, and function libraries. We had this functionality before PHP, too, with Server-Side Includes. This include() functionality is the core of the FreeEnergy concept.

FreeEnergy uses the idea of ‘modules’. These modules are a set of directories, generally kept outside of the web directory tree for security reasons. Almost all FreeEnergy code is kept in these modules. Module files are PHP scripts that are included when needed. Some standard modules are layout, navigation, action, utility and screen. These directories contain code specific to their function, and will be examined in greater detail in a moment.

A FreeEnergy application has only one PHP script inside the web directory, named index.php. This is the controlling code for the application, and it loads modules when needed. It is a small script, and we will discuss it in the upcoming example.

As an example of how the FreeEnergy concept works, we’ll create the layout of an e-commerce site. Design and content are elements of every page in a site.
While the design can be almost the same for groups of pages, the content will be different on every page. Since mixing design and content is never a good idea, we should try to keep them separate. Usually we have a few design schemes for a site. For example, we could have different schemes for the home page, department and product pages, news pages, and pop-up pages. The FreeEnergy concept calls each scheme a layout. You may have as many layouts for a site as you need, and these are housed in the layout module directory.

Each layout consists of one or more components. For example, one layout might have a header, content, and a footer as components. Another layout might have a header, a left-side navigation bar, content, and a footer. These components can be laid out in the layout file using HTML tables, div tags, or any other means. Note that none of the components of a layout are hardcoded. They are just included into it with a simple include(). Each component is saved in a separate file. The code making up each of the navigation, header and footer components is kept in the navigation module directory. The content component (or screen) is kept in a file in the screen module directory. A screen may be pure HTML or PHP, but the screen file never performs any action past showing page content.

Listing 1 lists the code for an example screen showing a user login page. This example login screen script would reside in the login file in the modules/screen directory. Module files have no file extension (like .php) because they are to be included, not executed.

I will stay focused on this example and explain a few things. FreeEnergy has a general module, called utility, which contains common functions and libraries. Some of these files are included in every FreeEnergy page. In Listing 1 we used functions such as StartForm(), startTable(), and printLink(), which are functions from the utility module. FreeEnergy doesn’t force you to use these functions (you may use print("<table>") instead of startTable()), but using them ensures that you can control all site elements from single functions. This means that the StartForm() function controls all site forms. This is great because we can automatically add as many hidden variables to all of the forms as we need. Site changes are very easy with this approach!

Let’s look at the definition of the StartForm() function and compare it with the function call in our example screen in Listing 1.

function StartForm($screen, $method='get', $action='', $secure=FALSE, $extra='', $windowName='', $encodingType='', $formName='', $onReset='', $onSubmit='', $class=FALSE);
The $screen parameter defines which screen will be loaded when the form is submitted. $method specifies whether the GET or POST form method is to be used. $action names the action to be executed before the next screen is shown. $secure defines whether we will use HTTP or HTTPS for this form. These function parameters are the most important. For our purposes, the remaining parameters may be ignored.

The action is a file in the modules/action directory which will be included and executed before the target screen is loaded. Actions never print text. They only take action and return a result to the $ActionResults array. Some actions are very simple (e.g. adding a news item to a site), but some are very complex and require several steps (e.g. submitting an order). Each step of an action must complete successfully for the action to be successful.

In our example, the target screen is ‘contents’ (which will show the shopping cart contents), and the action script is ‘LOGIN_USER’. When the form is submitted it will load and execute the action script. If the action completes successfully, it will then transfer the user to a page showing the contents of his shopping cart.

Now that we know how FreeEnergy works, let’s take a look at how FreeTrade builds on it.

FreeTrade Organization

FreeTrade, a child of the FreeEnergy concept, follows the same code organization we just described. The directory structure is as follows:
·htdocs/
    index.php
·modules/
    action/
    configuration/
    database/
    help/
    language/
    layout/
    navigation/
    screen/
    utility/

We already know the purpose of the general FreeEnergy modules like layout, navigation, screen and action. Let’s examine the new FreeTrade-specific modules.

The configuration module contains scripts for specifying application parameters. The well-commented modules/configuration/global file defines all important global constants. I will discuss this file in detail later.
The modules/configuration/screenInfo script describes every page (page=screen) of the site. The default screen definition is as follows:

$ScreenDefaults = array(
    'SI_TITLE' => 'Page title...',
    'SI_DESCRIPTION' => 'Description...',
    'SI_KEYWORDS' => 'Keywords...',
    'SI_LAYOUT' => 'with_side_nav',
    'SI_BODY' => 'style="background: white;"',
    'SI_PERMISSION' => FALSE);
‘SI_LAYOUT’ defines the layout to use for this screen. ‘SI_PERMISSION’ is an array of user permissions needed to view the page. For instance, to limit page access to the administrator, we would use:

'SI_PERMISSION' => array('Administrate')

All screen definitions are stored in an array called $ScreenInfo. Screen definitions don’t usually specify all of the parameters. Any parameters that are omitted will be filled in from $ScreenDefaults. An example of an administrative section for adding a new product into the store database is as follows:

'add_item' => array(
    'SI_TITLE' => 'Add item to Catalog',
    'SI_PERMISSION' => array('Administrate'))

This page will have a default description, default keywords, and the default layout (‘with_side_nav’). An example of a screen with a different layout is as follows:

'help' => array(
    'SI_TITLE' => 'Help',
    'SI_LAYOUT' => 'plain')
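The way omitted parameters fall back to $ScreenDefaults can be sketched with a plain array merge. The resolveScreen() helper and the use of array_merge() are assumptions made for illustration; the article does not show how FreeTrade itself combines the two arrays.

```php
<?php
// Defaults and screen definitions, copied from the examples in the text.
$ScreenDefaults = array(
    'SI_TITLE'       => 'Page title...',
    'SI_DESCRIPTION' => 'Description...',
    'SI_KEYWORDS'    => 'Keywords...',
    'SI_LAYOUT'      => 'with_side_nav',
    'SI_BODY'        => 'style="background: white;"',
    'SI_PERMISSION'  => FALSE,
);

$ScreenInfo = array(
    'add_item' => array(
        'SI_TITLE'      => 'Add item to Catalog',
        'SI_PERMISSION' => array('Administrate'),
    ),
    'help' => array(
        'SI_TITLE'  => 'Help',
        'SI_LAYOUT' => 'plain',
    ),
);

/**
 * Hypothetical helper: return the complete definition of a screen,
 * taking every parameter the screen omits from the defaults.
 */
function resolveScreen(string $name, array $screenInfo, array $defaults): array
{
    $definition = isset($screenInfo[$name]) ? $screenInfo[$name] : array();
    // Later arrays win in array_merge(), so screen values override defaults.
    return array_merge($defaults, $definition);
}

$help = resolveScreen('help', $ScreenInfo, $ScreenDefaults);
echo $help['SI_LAYOUT'] . "\n";                               // the screen's own layout
echo var_export($help['SI_PERMISSION'], true) . "\n";         // taken from the defaults
```

With this approach an unknown screen name simply resolves to the defaults, which a controller could treat as a signal to show the default screen instead.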
In this help screen we defined a different layout, but omitted permissions. This means that the default permissions will be used (everybody will be able to open the page). The database module is a kind of abstraction layer for MySQL and PostgreSQL databases. Which of them to use is up to you. The database type is set in the modules/configuration/global file, and the database module contains function libraries for database operations. The help module is a set of screens to create contextsensitive help. The language module is used for localization. FreeTrade may be totally localized and translated to any language. By default, it supports English, Italian, German, French and Spanish. Now let’s see how to install and configure the script.
Configuration and Installation

FreeTrade is easy to configure, but there are a few things which may cause problems. FreeTrade requires at least PHP 4.2. Take a look again at the FreeTrade directory structure. The modules directory is parallel to htdocs. This implies that modules has to be outside the web directory. If your host doesn’t allow such a configuration, it can be a security problem. You can place the modules directory within htdocs, but the module files don’t have .php extensions. This means that they may not be parsed by PHP and everyone will be able to see your configuration parameters. Although you can protect this directory, the only secure configuration is to place modules outside of the Web directory.

After you install the code, you will need to make a database for FreeTrade. In the FreeTrade distribution package, you will find an install directory with SQL files for both MySQL and PostgreSQL. Load the desired database type’s build.sql into a new database. You may also load sampledb.sql into the database to populate it with sample data for testing.

The last installation step is to modify the modules/configuration/global file. You will probably need to change only database-related parameters, but I will explain some other interesting parameters, too.

The first section of this file defines the debug status and the behaviour of error logging. You can set the ‘DEBUG’ constant to ‘TRUE’ for site-wide debugging, but don’t forget to turn it off once the site is live. Also, make sure that the error log directory you select in ‘LOG_DESTINATION’ exists and is writable by the web process.

The next section of this file is for configuring global network behaviour. Setting ‘USE_FLAT_URLS’ to ‘TRUE’ will turn on flat URLs (which will be decoded in index.php). Flat URLs use a sneaky technique to pass data in like a querystring.
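One way such a URL could be decoded is sketched below. The article does not show FreeTrade's own decoder, so the layout assumed here (the screen name first, then key/value pairs, then a cosmetic .html suffix) is only inferred from the flat URL example in this section, and the function name decodeFlatPath() is invented. In a web request, the part of the URL after index.php is typically available as $_SERVER['PATH_INFO'].

```php
<?php
/**
 * Decode the part of a flat URL that follows index.php into an array
 * resembling a query string.
 * Assumed layout: /SCREEN/key/value/key/value.html
 */
function decodeFlatPath(string $pathInfo): array
{
    $path = trim($pathInfo, '/');
    $path = preg_replace('/\.html$/', '', $path);    // drop the cosmetic suffix
    if ($path === '') {
        return array();
    }

    $parts = explode('/', $path);
    $vars  = array('SCREEN' => urldecode(array_shift($parts)));

    // The remaining segments are read as key/value pairs;
    // a trailing key without a value is ignored.
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        $vars[urldecode($parts[$i])] = urldecode($parts[$i + 1]);
    }
    return $vars;
}

print_r(decodeFlatPath('/item/department/Green/item/Jacket.html'));
```

The decoded values are names such as 'Green' rather than numeric IDs, so a real implementation would still have to look them up in the catalog.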
A flat URL looks like this:

http://www.site.com/index.php/item/department/Green/item/Jacket.html

A normal URL would look like this:

http://www.site.com/index.php?SCREEN=item&department=2&item=3

Notice that the script being used here is actually just http://www.site.com/index.php. Many search engines will not index dynamic-looking sites like the second URL. By passing in data using the method shown in the first (flat) URL, you can encourage search engines to spider the site. The network behavior section of the global file also
allows you to specify the use of department names, instead of their IDs, in links (set ‘USE_NAMED_DEPARTMENTS’ to ‘TRUE’). The same is available for items (set ‘USE_NAMED_ITEMS’ to ‘TRUE’). Stores with these options turned on will also be better listed on search engines.

The ‘DEFAULT_SCREEN’ constant defines the screen to be loaded if no screen has been requested. By default, it is the welcome screen (home page). ‘SEND_EMAIL’ specifies whether or not the system will send order confirmation emails. The ‘USE_SSL’ constant should be set to ‘TRUE’ if you want sensitive areas of the system (e.g. the checkout process) to be viewed only over an SSL connection. This constant affects the ScreenURL() and StartForm() functions.

FreeTrade doesn’t force users to have cookies turned on. Session variables will be tracked anyway, but if you want to use cookies, set the ‘USE_COOKIES’ constant to ‘TRUE’.

The next section of the global configuration file is dedicated to the store catalog. Set ‘ITEMS_IN_MULTIPLE_DEPARTMENTS’ to ‘TRUE’ if you want some items to be shown in more than one department (very useful). The ‘ALLOW_DUPLICATE_DEPT_NAMES’ constant allows two departments to have the same name in the same parent department. I suggest that you turn this option off, especially if you have ‘USE_FLAT_URLS’ turned on.

FreeTrade users, by default, can buy items whether they are registered or not. If you want to force users to be registered before checkout, set ‘SHOPPER_MUST_REGISTER’ to ‘TRUE’. FreeTrade supports coupons and gift certificates as well. If you want to allow them on your site, you should set the ‘USE_COUPONS’ and ‘USE_GIFTCERTIFICATES’ constants to ‘TRUE’. By default, coupons are turned off and gift certificates are turned on.

The most important parameters in this file are related to the database. Indeed, it will be ‘challenging’ at best to run an online store without a successful connection to a database!
You can use MySQL (set the ‘DATABASE’ constant to ‘mysql’) or PostgreSQL (set the ‘DATABASE’ constant to ‘pgsql’). You will need to set ‘DATABASE_HOST’, ‘DATABASE_USER’, ‘DATABASE_PASSWORD’, and ‘DATABASE_NAME’ to match your server settings.

You may try the new FreeTrade caching option by turning the ‘USE_CACHE’ constant on. This option will force FreeTrade to cache the results of database queries. This feature is still in the testing phase, so don’t use it on live stores.

Now you can play with your FreeTrade application. The default administrator login is a username of ‘admin’ and a password of ‘admin’. To test FreeTrade, you may use the FreeTrade Test Specifications that are located in the doc directory of your distribution package. They cover all of the store functions.
Troubleshooting

If the script fails to run properly and your configuration parameters are correct, there may be two possible causes.

FreeTrade requires magic_quotes_gpc to be off. If magic quotes are on, FreeTrade will not function as expected. Throughout the site’s form handling code FreeTrade uses addslashes() when inserting or updating the database. It does not do anything special when setting the value of form variables, which may cause some headaches. You can only change the value of this setting in the system-wide php.ini file or in a .htaccess file in the site’s directory.

Some installations may experience problems with file paths. FreeTrade automatically tries to find the right file paths in index.php in order to provide the capability to change hosts painlessly. If it fails, you can set it up manually. I’ve never had problems with these paths on Unix/Apache, but I have experienced it a few times on Windows. Setting it up manually means hardcoding the file paths in index.php. The constants you may have to hardcode are ‘SERVER_NAME’ (the name of the server, like ‘www.phparch.com’), ‘SCRIPT_NAME’ (the name of the store script, which is ‘index.php’ by default), ‘APPLICATION_ROOT’ (the path on the Web server’s filesystem where the modules directory is located, like ‘/www/’ or ‘/usr/local/apache/’), and ‘EXTERNAL_PATH’ (the path from the root of the Web site to the index script, such as ‘/’ or ‘/store/’).

FreeTrade Workflow

Many first-time users have problems with understanding how things work in FreeTrade, so I will explain the FreeTrade execution process in detail.

When a page is requested, the index script first verifies that the PHP version is acceptable and that the configuration is correct. If everything is in order, it locates the modules directory. Next we include the global configuration file from the configuration module, as well as standard libraries from the utility and database modules.
Now that everything is ready for initialization, we include the initialization script from the utility module. This script will get the database initialization code and the caching code (if caching is turned on), and blank out the global variable $ActionResults. The initialization script handles cases where no session ID or department ID was provided, and sets them to defaults. Finally, this script detects the user’s browser, CSS support, Javascript settings, and anything else available in the ‘HTTP_USER_AGENT’ variable.

At this point, if everything is OK, we have finished preprocessing. Now we get into the more interesting part: executing actions. As stated previously, the
ACTION form variable, from the script which requested the page, names a file in the modules/action directory to be included. This script contains the action to take on submission of the form. For index.php, the action is like a black box. We include the requested action and receive the results (or error message) from it in $ActionResults.

The action usually has several steps, and all of them must be successfully completed. FreeTrade uses an interesting technique to process these steps. Instead of using nested ifs and ending up with deeply nested, hard-to-read code, it uses a nice feature of PHP4’s include() statement. In PHP4 a file that has been included can prematurely exit and return a value with a return() statement, like a function (this is valid for PHP4+ only; PHP3 doesn’t function like this). As actions are included files, we may use return() if any of the steps of an action fails. With this functionality, we can organize the action as a set of steps in if blocks. If any of them fails, the others will be skipped. This improves readability and decreases the size of our code.

Once an action has finished executing, we include the modules/configuration/ScreenInfo file and try to find the requested screen definition. If the screen is not found (or one has not been requested), we show the default welcome screen to the user. The default screen is defined in the modules/configuration/global file, as mentioned earlier (the ‘DEFAULT_SCREEN’ constant). The welcome screen is located in modules/screen, just like any other site screen. If we did find the screen definition, we read the layout and any other definitions for the requested screen.

Now we have the action results, the layout, and the screen content. We just need to print it out and we’re done. That’s all - the index.php script is over!

The workflow I’ve described is a general one. Each page is processed this way, whether it be user login or registration, adding an item to the shopping cart, or ordering.
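The early-return technique can be sketched as follows. The action's steps, field names and messages are invented for illustration and are not FreeTrade's LOGIN_USER code; the action is written to a temporary file only so that the example is self-contained, and $ActionResults is a local variable here rather than the global the article describes.

```php
<?php
// Sketch: an "action" written as an included file that returns early on failure.

$actionCode = <<<'PHP'
<?php
// Step 1: a username must be supplied.
if (empty($input['username'])) {
    $ActionResults[] = 'Username is required';
    return false;               // leave the included file immediately
}
// Step 2: a password must be supplied.
if (empty($input['password'])) {
    $ActionResults[] = 'Password is required';
    return false;
}
// All steps passed.
$ActionResults[] = 'Logged in';
return true;
PHP;

/**
 * Include an action file and return array(success flag, messages).
 * The included file shares this function's variable scope, so it can
 * read $input and append to $ActionResults.
 */
function runAction(string $file, array $input): array
{
    $ActionResults = array();
    $ok = include $file;        // value comes from the file's return statement
    return array($ok, $ActionResults);
}

// Write the action to a temporary file, run it once, then clean up.
$actionFile = tempnam(sys_get_temp_dir(), 'action');
file_put_contents($actionFile, $actionCode);

list($ok, $messages) = runAction($actionFile, array('username' => 'leon'));
echo ($ok ? 'OK' : 'FAILED') . ': ' . implode(', ', $messages) . "\n";

unlink($actionFile);
```

Each step is a flat if block, so adding a third or fourth check does not increase the nesting depth.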
Real Life Tips

FreeTrade has two sections. One section is for shoppers and one is for administrators. When logged in as an administrator, you will have an additional menu option (Admin Menu) in the left side navigation. This is a link to a set of protected pages for store administration. There you will find options for managing invoices, departments, products, gifts, coupons, taxes, and so on.

There are a few things which may confuse you if you are new. Some of them are related to FreeTrade, while some are related to e-commerce programming in general. I will try to clear things up and save you some time.

First, let me explain the FreeTrade nomenclature and how an online store basically works. FreeTrade calls your online shop a store. A store may have many different departments and subdepartments. For example, you might have two departments - one called ‘Books’, and another called ‘Music’. Each department can contain subdepartments and products. By default, there is a department called Root. All store departments are its subdepartments. You can place products in the Root department, too. A product is called an item in FreeTrade. A store user (a shopper) can add any item to their own shopping cart. FreeTrade uses the term basket when talking about the shopping cart. When a shopper is finished adding all of the items he plans to buy to his basket, he goes to the checkout page. On the checkout page, he is asked for his shipping and billing details. Finally, we save his order and eventually process his payment in real-time.

Some other important terms are attributes and variations. I will try to explain them using an example. If you are planning to sell shirts in your store, then a shirt is an item. This item can be different colors and different sizes. Color and size are referred to as attributes, and you are free to specify as many attributes as you need. In our example, the shirt’s color variations may be red, blue, white and black. Size variations may be S, M, L, and XL. Every variation may have an extra price attached to it (an XL shirt may cost an extra $2). You are free to specify as many variations as you need. You can manage attributes and variations in the Attributes and Variations administration menu.

Now that we know what attributes and variations are, let’s go back to items again. If we want to sell our example item (a shirt with different colors and sizes), we must be able to give the selection of these attributes to the shopper. For some less powerful e-commerce scripts, the only way to offer a selection of attributes is to define a different item for each variation of each attribute. This means that we would create an item for a green, size S shirt, another item for a red, size XL shirt, and so on. We would need items for all combinations of sizes and colors. After that, just imagine how to change the price of that shirt! You would have to edit all items and change the price for every one of them. Obviously this is not a good approach. Let’s introduce a new, well known term in e-commerce: a SKU.
SKU stands for Stock Keeping Unit and is an ID associated with a product for inventory purposes. Each SKU in FreeTrade has external SKU information to help connect a store’s offline business with its online business. FreeTrade uses this number to identify an individual product. A SKU is a subitem and it has its own name, price, weight, inventory stock, and any number of attributes and variations. An item has none of this. It has only a name, a description and images. All additional information for an item is stored in its SKUs. Each item has zero, one, or more SKUs. FreeTrade allows you to make items with no SKUs, but such items make no sense.

As I said, you can select as many attributes and variations as you like for a SKU. In our shirt example, we can select all available colors and sizes for our shirt. If there is more than one variation of any attribute (for example, there are 4 colors), the shopper will see a selection box with all available colors for this item and he will be able to select the color he likes. For each attribute with more than one variation, FreeTrade will show a separate selection box. In our example, the shopper will see two selection boxes. One selection box will be for color selection and the other will be for size selection.

You will probably have only one SKU per item, but sometimes it is good to have more than one SKU. For example, you may sell movies. You might have one SKU for the VHS version and another for the DVD version of the same item. Note that in this movie example you don’t actually need to make two SKUs, since you could define an attribute ‘Edition’ with two variations: ‘VHS’ and ‘DVD’. If, however, you need totally different pricing for both of these editions, you will have to define two SKUs.

A SKU can have two prices: List Price and Sale Price. The Sale Price is the price the item sells for, and it’s the only price used for calculating order totals. If both prices are defined, a user will see both prices.
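The relationship between an item, its SKUs, attributes and variations can be sketched as a nested array. The layout, the SKU code and the prices below are invented for illustration, since FreeTrade keeps this information in its database; the sketch also assumes that a variation's extra price is simply added to the SKU's Sale Price.

```php
<?php
// Sketch of one item with a single SKU, two attributes and their variations.
// Each variation maps to its extra price.
$item = array(
    'name' => 'Shirt',
    'skus' => array(
        'SHIRT-001' => array(
            'listPrice'  => 25.00,
            'salePrice'  => 20.00,
            'attributes' => array(
                'Color' => array('Red' => 0.00, 'Blue' => 0.00, 'White' => 0.00, 'Black' => 0.00),
                'Size'  => array('S' => 0.00, 'M' => 0.00, 'L' => 0.00, 'XL' => 2.00),
            ),
        ),
    ),
);

/**
 * Unit price of a SKU for the chosen variations: Sale Price plus any extras.
 * $choices maps attribute name => variation name.
 */
function skuPrice(array $sku, array $choices): float
{
    $price = $sku['salePrice'];
    foreach ($choices as $attribute => $variation) {
        if (!isset($sku['attributes'][$attribute][$variation])) {
            throw new InvalidArgumentException("Unknown $attribute variation: $variation");
        }
        $price += $sku['attributes'][$attribute][$variation];
    }
    return $price;
}

$price = skuPrice($item['skus']['SHIRT-001'], array('Color' => 'Red', 'Size' => 'XL'));
printf("%.2f\n", $price);
```

Changing the shirt's price now means editing one Sale Price, instead of one item per combination of color and size.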
You may leave List Price blank if you wish to show only sale prices. You can use the List Price to recalculate the prices for a whole department. If you implement a sale (Administrate Sales option), you can set the Sale Price to the List Price minus a defined discount.

FreeTrade supports cross-selling. For example, if you are selling mobile phones, cross-selling allows you to offer headphones, hands-free kits, and cases for a phone model on that phone’s detail page. This can definitely increase sales. If you want to cross-sell (show similar items and/or accessories on the item page), you have to define a relationship (Administrate Item-to-Item Relationships option). Once a relationship has been defined, you’ll need to specify all of the items in the relationship (Administrate Departments and Items). In order to actually show related items to the shopper, you’ll need to edit the modules/screen/item file. On line 240, you will find:
‘Cross-Sell’ and ‘Accessory’ are default relationships. To show any other relationship, you will need to add a line for each new relationship you’ve created. The showRelationship() function is defined in the same file, so you are able to modify it to show related items in the way you would like. The default way to show related items is just a bulleted list of linked item names.

Items may have up to 3 images: Thumbnail, Graphic and LargeGraphic. The Thumbnail image is used in the department listing. The Graphic image is used on an item’s detail page. The LargeGraphic image is used in a pop-up window opened by clicking the Graphic image on the item detail page. None of these images are required, so you can have items without images. FreeTrade does not support image uploads in this version, which may be a problem for store owners if they are not familiar with FTP. Of course, the beauty of open source is that you can always just add the upload option. This would require changes in a few of the scripts for item administration.

The checkout process takes a little getting used to at first, but I believe that this will be improved in future versions. As soon as the user steps into the checkout procedure, all of his shopping cart content is moved (the cart becomes empty!) to a temporary invoice. This may be a problem if a user decides to exit from the checkout procedure to buy something else. Although no data will be lost, the shopping cart will show only items added after he exited from the checkout. The checkout process will collect all of the items and the user will still be able to order all selected items, but it can be really confusing. A user expects all items to remain in his shopping cart until the checkout process finishes.

Aside from this, the checkout procedure is quite smooth. A shopper with items in his basket can step into the checkout page either from the basket content page or from the menu.
FreeTrade shows the complete content of his basket, along with a form for the shopper’s address. If the shopper is already registered and logged in, his address will be auto-filled. Form submission will transfer him to a billing page where he enters his billing information. FreeTrade will check the credit card information that a shopper enters here before transferring him to the last step: the confirmation page. This verification is handled by the modules/utility/validateCard script. Note that this verfication only checks that the credit card number is valid; no charges occur. If the card is
FEATURES
FreeTrade, e-commerce for developers
acceptable and the shopper confirms the order, FreeTrade will save it into the database.

Extending FreeTrade

FreeTrade is meant for developers. It is very extensible and flexible. The following is a practical example of how easy it is to extend the FreeTrade system to meet your specific needs. By default, FreeTrade stores credit card information in the database table invoice_billing for each individual order. This is a problem, in my opinion, even when using dedicated servers. Redundant data in a database is generally never good news, if not for reasons of efficiency, then for the problems data redundancy can cause in the area of maintaining data integrity. There are a few ways to avoid this problem. The best way is to use a third-party real-time credit card processor and not collect credit card numbers at all. If you are not able to use a third-party credit card processor, you could store the credit card information in the database for a day or two until you process the order. At that point, you can mask all but the last 3 or 4 digits of the credit card number, or you could just delete the credit card information.

I will give an example of how to make this work using MySQL. First, open the modules/database/mysql/invoice file and add the function in Listing 2.

Listing 2

    /*
    ** Function: cleanInvoiceCreditCard()
    ** Input: INTEGER invoice, REFERENCE error
    ** Output:
    ** Description: removes credit card info from
    ** invoice_billing
    */
    function cleanInvoiceCreditCard($invoice, &$error)
    {
        global $DatabaseLink;

        // force the invoice ID to an integer before it goes into the query
        $invoice = (integer)$invoice;

        $Query = "UPDATE invoice_billing SET ";
        $Query .= "CreditCardNumber = \"[REMOVED]\" ";
        $Query .= "WHERE Invoice = $invoice";

        if(!($DatabaseResult = mysql_query($Query, $DatabaseLink)))
        {
            $error = mysql_errno() . ": " . mysql_error();
            return(FALSE);
        }

        return(TRUE);
    }

Now make a new file, modules/action/CLEAN_INVOICE_CC, containing the code from Listing 3.

Listing 3 (the body of this listing is not preserved in this copy)

Finally, add a button for credit card cleaning in the modules/screen/edit_invoice file with the following:

    print(StartForm("edit_invoice", 'post', 'CLEAN_INVOICE_CC', FALSE,
        array("invoice_ID" => $_REQUEST['invoice_ID'])));
    printSubmit(localize('Clean Credit Card'));
    endForm();

This is a good example of a simple extension to the original FreeTrade behaviour. The store administrator will now be able to remove the credit card number as soon as it is no longer needed.

By default, FreeTrade has no module for credit card processing. FreeTrade is a developer product, and the authors assume that every developer will use the method provided by the individual processor, as the parameters may vary from processor to processor. This approach is fine for experienced developers, but beginners may have problems, especially if the credit card processor has poor documentation. In order to add credit card processing, you’ll need to obtain the processor’s documentation and change the modules/action/SUBMIT_ORDER file.
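The masking option mentioned earlier (keeping only the last 3 or 4 digits of a card number) can be sketched as a small helper. This is a minimal illustration, not part of FreeTrade: the function name maskCardNumber() and its $visible parameter are hypothetical, and the sketch assumes card numbers are stored as strings that may contain spaces or dashes.

```php
<?php
// Hypothetical helper, not part of FreeTrade: hides all but the last
// $visible digits of a card number. Non-digit characters are dropped first.
function maskCardNumber($number, $visible = 4)
{
    $digits = preg_replace('/\D/', '', (string)$number);
    $length = strlen($digits);

    // never try to keep more digits than exist, and never a negative count
    $keep = max(0, min((int)$visible, $length));
    if ($keep === 0) {
        return str_repeat('*', $length);
    }

    return str_repeat('*', $length - $keep) . substr($digits, -$keep);
}

print maskCardNumber('4111 1111 1111 1111') . "\n";
```

A function like this could supply the value written by an UPDATE such as the one in Listing 2, in place of the fixed "[REMOVED]" string, if the store still needs to recognise which card was used.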
Wrapping Up

There are a few things I would like to see included in FreeTrade’s future versions; for instance, better documentation, modules for credit card processing, better search functions, better reports, a bulk import solution, and an image upload function. However, I do find FreeTrade to be a usable product for developers in its current state. FreeTrade is a great Open Source product, but it’s not for everyone. It is a framework, not a plug-and-play solution. If you want a well-written script with lots of features and customization possibilities, give it a try.
A few words with Leon Atkinson

Leon Atkinson (http://www.leonatkinson.com), author of FreeTrade, has long been a big fan of PHP and MySQL. He is best known for the book Core PHP Programming, first published in 1999. It was the first book in English about PHP. In 2001, he also wrote Core MySQL. We asked him a few questions about FreeTrade.

What are your plans for FT in the coming period, after the final release of FT3?

Customers inspire new features in FT. I’m an independent consultant working with several Web development businesses. My partners bring projects to me. I use what I have and donate new code back to FT when appropriate. So, in a certain sense, new features are out there, beyond the horizon. (Low visibility, as people like to say regarding the economy.) There are a few ideas floating around, waiting to be implemented. Sean Farley, a frequent contributor, is working on implementing an affiliate program. That is, a system that allows the site operator to track sales referred from other sites. Amazon.com, being the pinnacle of online catalogs, inspires new functionality. In fact, customers often make remarks like, “I want the checkout process to be like Amazon.com”. However, it’s important to keep in mind that while Amazon.com is a great example of a mega-store, benefiting from millions of dollars of development, FT is suited for smaller online stores.

For the past several months I’ve been working on the third edition of Core PHP Programming. All the new features appearing in PHP inspire me to work them into FT. The database interface is always an interesting topic in open-source projects. Being flexible means more people can use your project, perhaps contributing. On the other hand, I dislike sacrificing performance to a lowest-common-denominator approach. That’s why we have separate database modules. PostgreSQL support has been complete for some time now. Lately I’ve been considering coding a module using the dbx_* functions.

I saw your A.G. Ferrari (http://www.agferrari.com) site; it’s great. Do you have some other examples of good FT sites?

Thanks. I have to give due respect to Clear Ink (www.clearink.com) who ran the project and Smashing Pixels (www.smashingpixels.com) who did the graphical design work. A great example no longer viewable is Restoration Hardware. This site grew so successful, it outgrew FT. It’s the nature of FT that once built, a site slowly drifts away from the original codebase. Whereas some software projects attempt to be a closed box with lots of knobs for customization, FT attempts to be the engine you place in your own box. An FT site is craftwork rather than something that comes off an assembly line. Another early site that demonstrates this drift is www.dantz.com, which still uses FT at its heart. Because anyone can grab the source code and put up a site without any help, sites appear without me knowing. I recently discovered Thrasher Magazine runs a version of FT (http://www.thrashermagazine.com). Sometimes people post links to their own projects to the mailing list, such as VentureOut in New Zealand: http://www.ventureout.co.nz/.

Do you have any idea how many FT e-commerce sites exist?

I don’t have any hard numbers. I suspect it’s something like 100.

In your opinion, what is the most common problem for novices developing e-commerce sites with FT?

Most questions we get are about setup. It’s assumed you’ll find the configuration/global file and read the comments, but some people don’t. Then there are issues with handling credit card numbers, particularly with hosting sites on shared servers. A successful online store deserves a dedicated server with most of its ports locked down. I find it hard to accept that a site that can’t cover dedicated hosting costs needs to process credit cards, but there are other methods. I try to encourage these lower-end sites to go with another payment solution, such as PayPal.

Do you have any particular suggestion for PHP developers planning to use FT?

Participate in the mailing list. There are several people who have used FT for quite a while and will answer questions quickly.

About The Author

Vladan Zirojevic is a Serbian web developer, educated in Computer Science. He has worked on more than 30 PHP projects, both as project leader and as part of a team, among them www.24sata.com, www.poljubac.com, www.trebinje.com, www.mobilnimagazin.com... Currently, he is employed as a Senior Web Developer at SEENETIX, Belgrade. He can be reached at: [email protected]

Click HERE To Discuss This Article https://www.phparch.com/discuss/viewforum.php?f=7
REVIEWS
phpLens
REVIEW
Published by Natsoft
Create, Paginate, Edit, and Search.
And now for the $64,000 question (someone should really update these maxims; the money is not that impressive anymore!): What happens when you introduce your PHP script to a database? If you’re lucky and have plenty of time (and money) on your hands, the answer could be a well-built, dynamic website. If time comes at a premium, however, you might end up with something that’s all too common in the IT world: an unmeetable deadline and, even worse, an unhappy client. This is where software like phpLens comes into play.

The phpLens system is a complex, but not complicated, application server that can be used to build dynamic, data-driven websites. Heard that one before? Hang on to your hats. phpLens is a completely integrated, object-oriented platform. It includes the ability to edit and manage your HTML templates (Smarty, or otherwise) directly from a web browser, which makes creating different views of your data easy and convenient. Figure 1 shows one of the examples provided on the phpLens website - there are almost thirty examples there - where a complete calendar template has been created and populated using a simple Smarty template and no more than twenty lines of code. Incredibly, the same system can produce something as divergent as a bar graph, shown in Figure 2, while still using fewer than thirty lines of code.
The Cost: $120 - $3,600 (US)
Requirements:
PHP 4.06 or later
Zend Optimizer
MySQL or Oracle
PHP4 sessions must be enabled
Database Accessibility

The phpLens engine is written entirely in PHP and is based on the well known ADOdb database abstraction library developed by John Lim (who, incidentally, is also the technical lead of the phpLens team). Anyone who has ever used ADOdb knows how seamlessly it works across a multitude of database systems, including MySQL, Microsoft SQL Server and Oracle, to name a few.

Perhaps one of the most impressive features of this software is its data manipulation functionality. Editing datasets is a breeze in both the “standard” and “Hot editing” modes. The “standard” mode refers to editing records one-per-page. “Hot editing” refers to a method where multiple records can be modified at the same time using an interface similar to that of a typical electronic spreadsheet. The data editing engine automatically recognizes the data type of each field, including enumerative types (managed through drop-down lists). It’s also possible to specify very complex relationships between tables.

Security Features

phpLens offers both simple and sophisticated security features. Some of the simple features include the ability to password-protect the dynamic editing functions, and to automatically filter user input for potentially dangerous HTML and SQL statements. One of the more sophisticated features is the checksumming of records to ensure that the data is not transformed in-transit by a third party. Another is the fingerprinting of data, using the md5 digest algorithm, in order to prevent a malicious user from creating fictitious records without the proper authorization. It should be noted that phpLens does not implement a full authentication system. This is left to the developer to implement using their own choice of tools, such as Smarty. Finally, the phpLens documentation includes a step-by-step procedure for securing the folder in which phpLens resides, without impeding its functionality.

Documentation and Support

phpLens comes with an excellent set of user documentation. By “user” I actually mean “developer”, since phpLens doesn’t force its adopter to utilize a particular end-user interface. The entire documentation set is available online. It follows a very natural path, from installation all the way down to the smallest security details. In addition, the phpLens website includes a complete reference of all the objects, properties and methods used by the package. Although support for the product is offered on a commercial basis, the phpLens website includes an astounding number of examples. Also available are a FAQ area and a very active peer-support forum (all developed using phpLens, of course).
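To make the idea of record fingerprinting from the Security Features section a little more concrete, here is a minimal sketch of how a keyed md5 fingerprint can detect a forged or altered record. It is not phpLens code and makes no claim about how phpLens implements the feature; the function names, the $secret value and the record layout are all assumptions made for illustration.

```php
<?php
// Hypothetical helpers, not the phpLens API. A keyed md5 digest (HMAC) is
// computed over the record; without the server-side secret, a visitor cannot
// produce a matching fingerprint for a record they made up.
function fingerprintRecord(array $record, $secret)
{
    ksort($record); // field order must not change the fingerprint
    return hash_hmac('md5', serialize($record), $secret);
}

function recordIsAuthentic(array $record, $fingerprint, $secret)
{
    // hash_equals() compares the two strings in constant time
    return hash_equals(fingerprintRecord($record, $secret), $fingerprint);
}

$secret = 'server-side secret (assumed)';
$record = array('id' => '42', 'name' => 'Widget', 'price' => '9.95');

$print = fingerprintRecord($record, $secret);
var_dump(recordIsAuthentic($record, $print, $secret));

$record['price'] = '0.01'; // tampered in transit
var_dump(recordIsAuthentic($record, $print, $secret));
```

The fingerprint would travel with the record (in a hidden form field, for example) and be checked when the form comes back. Note that serialize() distinguishes the string '42' from the integer 42, so both sides must use the same types.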
Figure 1
Packaging and Limitations

phpLens comes in three flavours: Basic, Advanced and Enterprise. The Basic version supports all the data manipulation and presentation layer features, with the exception of creating and editing records. The Advanced and Enterprise versions, on the other hand, support all of the features. The only difference between the two is the number of database systems they each support.

Because it is based on pure PHP code, phpLens runs on pretty much any platform PHP runs on. The only requirement is the presence of the Zend Optimizer, as all of the phpLens scripts and include files are encrypted using the Zend Encoder. This means that you won’t be able to view and modify phpLens’ source code, although this shouldn’t be much of a problem thanks to its object-oriented design. Incidentally, if you’re concerned that it might be difficult to find a hosting provider that uses the Zend Optimizer, the phpLens website includes a list of ISPs who do.

The Bottom Line

phpLens is not an expensive product. Considering the standard set by other similar applications, it offers an excellent set of functionality for a very reasonable price. If you need to set up a complex, data-driven website and you’re in the game for the long term, phpLens is worth serious consideration. Its object-oriented structure is a good bet for the future, and will easily fit into the PHP 5 philosophy once the new version of the PHP platform becomes available. php|a
Figure 2
FEATURES
Blazing Site Performance Using Objects and Sessions
By Peter Moulding
Object Oriented Programming (OOP) is in! PHP 4 works with OOP, and PHP 5 will make it perfect. Likewise, PHP makes using sessions easy. Sessions pass state information from one Web page to another. You can place almost anything in a session, including objects. By marrying the two and storing objects in your sessions, you can take advantage of ‘run once, use everywhere’ processing.
In this article I’ll look at ways to match the best features of objects with the best features of sessions, to build fast and efficient dynamic Web sites. I’ll talk at great length about performance considerations, and look at ways to keep your session data small, and your site’s performance blazing. All of this theory will be supplemented with a wealth of practical examples (and some neat little tricks) which I hope you’ll find useful.

Objects are based on classes, and after looking at the programming books in the local bookshop, it seems that there are around 287 ways to write classes. In this article, I will skip the style information and go straight to the core structural issues relevant to building your classes. The information provided is useful for writing all classes, whether used in sessions or not.

The code included in this article has been tested with Web sites using PHP 4.2.x and 4.3.0. Some examples are stripped-down classes representing techniques used in large-scale commercial sites. Others will show you what does not work. The examples are tested using sessions stored in files, as well as MySQL and Oracle databases. You might decide to change your session configuration, or switch databases, after reading this article.

Session Performance

One of the first questions you might ask is, “How do sessions perform when filled with objects?” When you
use sessions, PHP adds a file or database read at the start of your script, and a file or database write at the end of your script. The read should be instantaneous because the data should be cached in memory from the previous script execution. The write will slow down your script, but it usually occurs after you send your last chunk of data to the visitor’s browser, so the visitor should never be impacted by the write time.

Oracle, MySQL or Files, Oh My!

Session records can be stored in files or databases. File access is fast on most systems. Microsoft NTFS writes files in 4KB chunks, while Linux ext2 uses 8KB chunks. In effect, when you use files there is little difference between writing a short session record containing only a 32-character session ID versus writing a 2KB session record containing a large object. If your files are on a RAID device, the RAID device might be writing 64KB chunks. This means you can ‘go crazy’ with storing big objects in the session, without worrying too much about the consequences.

REQUIREMENTS
PHP Version: 4.3 or Above
O/S: Any
Additional Software: N/A
Databases store rows in pages. Pages are chunks of disk, usually 4KB long. If your session data is 50 bytes long, a page stores 80 rows. If your object blows the session data out to 2KB, a page stores just 2 rows. Databases are more space efficient than files. However, from a performance perspective, a page write is the same as a file block write, so a database should not slow down your sessions.

MySQL is slower than files on some systems and a little faster than files on other systems, but generally there should be almost no measurable difference between a 50 byte session row and a 2KB session row. Furthermore, the time difference is linear, which means that adding an extra object produces little difference. Although MySQL lets you choose from three different table types, the default native table is the best choice as it has no transaction processing overhead.

Oracle is different. Oracle has 4KB text fields, so anything larger has to go into a CLOB (Character Large Object). Tuning Oracle for optimum performance is painful, and tuning the use of CLOBs is even more difficult. Oracle sites report overheads of 50% or more, unless you use techniques specific to Oracle. To me it seems silly to use database abstraction classes to drive Oracle PL/SQL, or anything else that can work only on Oracle. Oracle is great for transaction-based processing, but sessions do not need transactions! If you have an Oracle expert on board, you can place your session records in Oracle. Otherwise, leave your sessions in files.

You may also consider leaving your sessions in files if your database is on a different box from your Web server and you can’t use a local database for sessions. Whatever you do, do not place your sessions in a database on another machine, or even on a disk shared across the network, as the added network and system complexities can increase overhead by 400% or more.
Optimizing Code Structure for Sessions

When you place an object into a session, only the data from the object gets stored in the session. When the session data is written out, the object is passed through serialize(), which squashes the object variables into a single string. Code does not take up space in the session, so why would we want to focus on keeping the code small? Basically, we want small code because it has to be compiled at some point. On many web sites, code is compiled every time a script starts. Imagine a class with several thousand lines of code being compiled for every web page. An optimizer will cache the compiled code, but not every ISP uses an optimizer, and optimizers cannot cache all scripts.

When we store an object in a session, the class code will still need to be compiled again on each page when the object is deserialized out of the session. If we can minimize the amount of code that needs to be loaded and compiled in order for an object to be deserialized, we gain efficiency. To this end, I prefer to split code into a structure based on usage. One block is the code used at the start, then a block used in the middle and, finally, a block used at the end. It is a bit like splitting a web page into header, body and footer, or a file access operation into open, read and close. This approach is a good first step towards cutting large blocks of code into small, manageable chunks, and has the added benefit of building memory efficient objects. When thinking about storing objects in sessions, memory usage efficiency also means session usage efficiency.

PHP classes offer a class constructor to perform the initial work to be performed by the class. Unfortunately, the constructor code remains with the class throughout its life. If our constructor code is huge, we can consider putting the bulk of the constructor code into a separate class. In the upcoming example classes, the class objective_foot has all of the construction code in the constructor. objective_foot2 is a reconstructed version of objective_foot where the constructor contains just a single include of objective_foot_start. The objective_foot_start class includes all of the components needed to build the data for the object. The idea here is that the ‘expensive’ part of the class, the constructor, is only needed once (on the first page of the site). Why carry it around with us when all we want is the data?

Splitting the code into separate “use once” and “use on every page” classes will make both compilation and caching more effective. Place all of your difficult-to-write code into the “use once” class. Keep the “use on every page” class simple, easy to understand, and easy to maintain.
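The point that serialize() stores an object's data but none of its code is easy to check for yourself. The following is a minimal sketch written for current PHP versions; the demo_footer class is a hypothetical stand-in, not one of the article's listings.

```php
<?php
// Hypothetical stand-in class: one property, one method.
class demo_footer
{
    public $html = '<p>Copyright notice</p>';

    public function get_html()
    {
        return $this->html;
    }
}

$stored = serialize(new demo_footer());

// The string holds the class name and the property, nothing else.
print $stored . "\n";

// The method name does not appear anywhere in the serialized string.
var_dump(strpos($stored, 'get_html') !== false);
```

However many methods the class gains, the serialized string stays the same size; only extra properties make the session record grow.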
Once your “use everywhere” class actually is used everywhere, making changes will be difficult. You will reduce maintenance problems by organizing your code based on usage. There is another objective for splitting your code: providing a layer of abstraction to afford yourself the flexibility of handling future changes in the logic section of your code. The “use on every page” class should contain just data. File processing and other specialized logic code should be in another class so that it can be easily altered. Next week you might have to replace file processing with databases, or text input might become XML. If you can split your code along these ‘logic vs. data’ lines, you gain a level of abstraction where the users of the “use on every page” code work only with data, and are not affected by changes to the logic.
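As a rough illustration of the ‘logic vs. data’ split, the sketch below keeps a tiny data-only class for the session and puts the building logic in a separate class. The names page_footer and page_footer_builder and the shape of the $source array are assumptions made for this example; in a real site the builder would sit in its own file and be included only on the page that needs it.

```php
<?php
// "Use on every page": data only, so it is cheap to compile and small in
// the session.
class page_footer
{
    public $html = '';

    public function get_html()
    {
        return $this->html;
    }
}

// "Use once": the expensive logic. In a real site this class would live in
// its own file, pulled in with require_once() on the first page only.
class page_footer_builder
{
    // $source is a hypothetical stand-in for file, XML or database input.
    public static function build(array $source)
    {
        $footer = new page_footer();
        $footer->html = '<p>&copy; ' . htmlspecialchars($source['owner']) . '</p>';
        return $footer;
    }
}

$footer = page_footer_builder::build(array('owner' => 'petermoulding.com'));

// What would travel in the session: the data object, not the builder.
$stored = serialize($footer);
$copy = unserialize($stored);
print $copy->get_html() . "\n";
```

If the input later moves from files to a database, only page_footer_builder changes; every page that reads page_footer is untouched.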
An Example or Two
The Example Classes
I would like to show you a full content management system, but the code listing would flood this magazine and squeeze out the interesting stuff! The following examples represent a real set of code that reads XML and database sources to build page content, and are minimal examples I use when teaching PHP. In actual web sites, the code builds input from many sources, and the initial class constructor grows into multiple includes, wrapped in miles of logic. Individual inclusions can have thousands of lines of code which find, read and decode input. For this discussion, all we need are a few inclusions with a sample line of code in each. Let’s pretend we have the task of creating a common footer for all the web pages at petermoulding.com and we are told to use existing formatting classes. This footer is a copyright notice centered on a blue background. You’ll notice that there are a number of small classes in this example and you might wonder why. We want an example where there is a big chunk of code used once, and one small object which is used repeatedly. In this example, all of the classes are used when the visitor requests the first page from the site. For every other page request, only the class containing the formatted HTML is needed.
objective_html is the base class for all of our other HTML classes. It provides a common $html variable and the get_html() method. The class is stored in a file named objective_html.class in the site’s include directory and is shown in Listing 1. The next few classes are derived from the objective_html class and are shown in Listings 2 through 6. The objective_foot class includes all the previous classes in one big constructor and is shown in Listing 7.

Listing 1: objective_html class

    [...]
            $this->html = $html;
        }

        function get_html()
        {
            return($this->html);
        }
    }

    ?>

Listing 2: objective_font class

    [...]
            $this->html = ''
                . $text . '';
        }
    }

    ?>

Listing 3: objective_link class

    [...]
            $this->html = ''
                . $text . '';
        }
    }

    ?>

Listing 4: objective_td class

    [...]
            $this->html = ''
                . $text . '';
        }
    }

    ?>

The Result

Listing 8 ties it all together. The image in Figure 1 shows the fruits of our labor. This might seem like a lot of work for this small result, but remember that the examples are representative of code delivering a whole web page constructed from 30 or more sources.

Now let’s see what the code looks like when we apply the optimizations we talked about earlier. The objective_foot_start class, shown in Listing 9, contains all of the constructor code that has been removed from the objective_foot class. If any of the code in this class prevents your optimizer from caching the compiled code, you will lose caching on only one page (the first one). Listing 10 shows the objective_foot2 class, a version of objective_foot with all the constructor code replaced by one require_once() that includes the objective_foot_start class. When this class is compiled for your second and subsequent pages, the compilation time is reduced and there is more chance your optimizer can cache this simple class. Now, let’s see how to put these objects into the session.
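Because the class listings are incomplete in this copy, here is a minimal sketch of how a hierarchy of this kind might look. It is a reconstruction under stated assumptions, not the author's original code: it uses current PHP syntax (__construct and public properties, where the article's PHP 4 code would have used var and a constructor named after the class), and the HTML attributes, the $url parameter and the table wrapper are invented for illustration.

```php
<?php
// Base class: a common $html variable and the get_html() method.
class objective_html
{
    public $html = '';

    public function __construct($html = '')
    {
        $this->html = $html;
    }

    public function get_html()
    {
        return $this->html;
    }
}

// Each formatting class wraps its input in one element (attributes assumed).
class objective_font extends objective_html
{
    public function __construct($text)
    {
        $this->html = '<font color="white">' . $text . '</font>';
    }
}

class objective_link extends objective_html
{
    public function __construct($text, $url) // $url is an assumed parameter
    {
        $this->html = '<a href="' . htmlspecialchars($url) . '">' . $text . '</a>';
    }
}

class objective_td extends objective_html
{
    public function __construct($text)
    {
        $this->html = '<td align="center" bgcolor="blue">' . $text . '</td>';
    }
}

// The footer does all of its work once, in the constructor; afterwards the
// object is nothing more than a string of finished HTML.
class objective_foot extends objective_html
{
    public function __construct()
    {
        $link = new objective_link('petermoulding.com', 'http://petermoulding.com/');
        $font = new objective_font('Copyright ' . $link->get_html());
        $td = new objective_td($font->get_html());
        $this->html = '<table width="100%"><tr>' . $td->get_html() . '</tr></table>';
    }
}

$foot = new objective_foot();
print $foot->get_html() . "\n";
```

Once the footer object exists, later pages only need get_html(); moving the body of the objective_foot constructor into a separately included class is exactly the step the optimized version takes.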
Getting Objects into the Session

When you manually start the session in your code with session_start(), placing objects in the session is easy. On your first page, you can use the following code to create an objective_foot object in the session.

    require_once('objective_foot.class');
    session_start();
    $_SESSION['foot'] = new objective_foot();
    print $_SESSION['foot']->get_html();

On the first page, the require_once() and the session_start() can go in any order. At the end of the first page, PHP will run $_SESSION['foot'] through serialize() to produce a string for storage in the session record. Life is different on the second and subsequent pages.
The session_start() on the second page will grab the string from the session record and run the string through unserialize() to recreate the object in the $_SESSION['foot'] variable. Remember that objects require code, and code is not stored in serialized objects. The unserialize() function has to read the object’s class code and add the code back to the object, which means that you have to be sure to include the class before using session_start(). This works:

    require_once('objective_foot.class');
    session_start();
    print $_SESSION['foot']->get_html();

but this does not work:

    session_start();
    require_once('objective_foot.class');
    print $_SESSION['foot']->get_html();
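What goes wrong in the second ordering can be shown without a session at all. In the sketch below (current PHP syntax, hypothetical class names), a serialized object is unserialized once with its class defined and once under the name of a class that has never been loaded; PHP hands back a placeholder object of class __PHP_Incomplete_Class, which has the data but none of the methods.

```php
<?php
class foot_demo_a
{
    public $html = 'footer';

    public function get_html()
    {
        return $this->html;
    }
}

$stored = serialize(new foot_demo_a());

// Pretend the string arrived on a page where the class file was never
// included: foot_demo_b is deliberately undefined. The names have the same
// length, so the serialized string stays valid.
$orphan = str_replace('foot_demo_a', 'foot_demo_b', $stored);

$good = unserialize($stored);
$bad = unserialize($orphan);

print get_class($good) . "\n";
print get_class($bad) . "\n";

// Calling $bad->get_html() here would be a fatal error.
var_dump($bad instanceof __PHP_Incomplete_Class);
```

This is the situation session_start() creates when it runs before the require_once(): the object comes back, but as a placeholder with no get_html() to call.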
The next question is what to do when your ISP sets PHP to automatically start sessions and will not change the setting for you. You should consider changing your ISP. :-) There is something less dramatic than changing your ISP, and it is something you might want to do anyway to lower your overhead. I mentioned before that you might want to restructure in order to minimize the code you include on every page. You can go one step further to solve the problem of including unnecessary classes on every page.

Listing 8: Example usage of objective_foot class

    [...]->get_html();
    ?>

Close your eyes and visualize a complete, perfect content management system written to do absolutely everything you want (and everything you will ever want). The code is optimized to include the bare minimum number of classes on every page. The trouble is that the bare minimum includes hundreds of classes, with most classes used only on a few pages. In front of your session_start() are 200 lines of require_once(). This would normally be quite a problem with automatically started sessions. Here is a way to prevent including every class on every page which also solves the problem of session_start() happening before your code gets a chance to include any classes. On your first page, write:

    require_once('objective_foot.class');
    $foot = new objective_foot();
    $_SESSION['foot'] = serialize($foot);
(Note that if you want to use this method and you don’t have sessions automatically starting, you will need a session_start() call at the beginning.)

Listing 10: objective_foot2 class

    [...]->get_html());
        }
    }

    ?>

On your second page, write:

    require_once('objective_foot.class');
    $foot = unserialize($_SESSION['foot']);

(Once again, if you don’t have sessions automatically starting, you will need a session_start() call after the require_once().)

All you are doing is replacing manual sessions and automatic serialization with automatic sessions and manual serialization. Because you control the unserialization, you can include the class before unserialize(). You also get to choose which objects are unserialized for a given page. You include only the classes you need and unserialize only the related objects. Do not worry about the objects you do not unserialize on a specific page because they will float from page to page in the session record for as long as you want. When you are finished with an object in the session, simply unset it like so:

    unset($_SESSION['foot']);
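The manual serialization pattern can be wrapped in two small helpers. This is a minimal sketch: stash_object() and fetch_object() are hypothetical names, and the session array is passed in by reference so that the example runs from the command line with a plain array; in a page you would pass $_SESSION instead.

```php
<?php
// Hypothetical stand-in for a footer class.
class demo_foot
{
    public $html = '<p>footer</p>';

    public function get_html()
    {
        return $this->html;
    }
}

// Store an object as a serialized string under $key.
function stash_object(array &$session, $key, $object)
{
    $session[$key] = serialize($object);
}

// Return the object stored under $key, or null if nothing is stored there.
// The object's class must already be loaded when this is called.
function fetch_object(array &$session, $key)
{
    if (!isset($session[$key])) {
        return null;
    }
    return unserialize($session[$key]);
}

// $session stands in for $_SESSION so the sketch runs without a web server.
$session = array();

// First page: build the object and store the string.
stash_object($session, 'foot', new demo_foot());

// Later page: load the class, then bring the object back.
$foot = fetch_object($session, 'foot');
print $foot->get_html() . "\n";

// Finished with the object: remove the string from the session.
unset($session['foot']);
var_dump(fetch_object($session, 'foot'));
```

Because the session only ever holds strings, an automatically started session has nothing to unserialize and never needs the class definitions up front.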
There are a few benefits and a couple of traps to watch out for when performing your own serialization. One trap is deciding when to create the object. There is no reason to create an object before you have data for the object. If you do not create all your objects on the first page, test subsequent pages for whether the object exists before using unserialize(), like so:

    if (isset($_SESSION['foot']))
    {
        $foot = unserialize($_SESSION['foot']);
    }
Another trap is letting PHP automatically convert $_SESSION['foot'] to $foot through the register_globals feature. Make sure your php.ini sets the register_globals directive to 'off'. If your ISP will not turn off register_globals, change $_SESSION['foot'] to $_SESSION['foot_string'] so that register_globals will not set your code off on the wrong '$foot'.

Selective Object Usage

When you control unserialize() you also open another option. Pretend our content management system has to read articles dated 2001 from text files, articles dated 2002 from XML files, and later articles from XML in a database. You could have three versions of an object in the session. For instance, if our example objective_foot class came in three versions, you could populate your session with variables like:

$_SESSION['foot']['database']
$_SESSION['foot']['text']
$_SESSION['foot']['xml']
If $x = 'xml', the following code would bring in the correct object for the current page (assuming that the classes are called objective_foot_database, objective_foot_text, and objective_foot_xml).

require_once('objective_foot_' . $x . '.class');
$foot = unserialize($_SESSION['foot'][$x]);
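A minimal sketch of that selection follows. The three one-line classes are hypothetical placeholders declared inline so the script is self-contained; on a real page only the class file matching $x would be loaded with require_once().

```php
<?php
// Hypothetical placeholders for the three versions of the class.
class objective_foot_database { public $source = 'database'; }
class objective_foot_text     { public $source = 'text'; }
class objective_foot_xml      { public $source = 'xml'; }

// Plain array standing in for the session.
$_SESSION = array();
$_SESSION['foot'] = array(
    'database' => serialize(new objective_foot_database()),
    'text'     => serialize(new objective_foot_text()),
    'xml'      => serialize(new objective_foot_xml()),
);

// Pick the version needed on this page; the other two stay serialized.
$x = 'xml';
$foot = unserialize($_SESSION['foot'][$x]);
print(get_class($foot) . "\n");
```

The objects that are not selected remain plain strings, so they cost nothing beyond their space in the session record.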
Data, Not Code

Some people push Object Oriented Programming as a way of storing data together with the processing required for the data. There are times when it is better to keep all data out of your class (and in some cases keep all code out of classes). The best practice is to separate the data class from the access class. The data object in the session stores the minimum identifiers required to access the data, and the access method is in a separate class. This leads to the well-proven practice of separating code from data.

Think of a simple class to perform calculations using Pi, which has no exact decimal value. Around 150 AD, Ptolemy said Pi had the value 3.1416, but by 1600 Van Ceulen was up to 35 decimal places after the 3. When you create a class to calculate the area of circles and the like, you want to keep the value of Pi out of your class so that your class is not restricted to the accuracy of the value you choose. The person using your class can then add their own value to give them the order of accuracy that they want. You might then extend the base, “code only” class with classes adding nothing more than the value of Pi to various degrees of accuracy. By 1873, Pi was up to 527 decimal places (actually, it was 707 decimal places, of which 527 were correct). 527 places is beyond PHP’s built-in arithmetic, and so you’d need to switch to bcmath (http://www.php.net/manual/en/ref.bc.php).

The approach of separating data from code, and then adding the data to the code by extending the code class, does not fit our design targets for objects stored in sessions, because you end up with all the code included in the data object. How do we marry the two approaches? The answer is to make the code class use the data class by reference. The tiny data class produces a tiny data object to sit in the session. The code class then uses the data class when needed. If you have Pi in class pi, and your calculations in class calc, your initial page will set Pi using the following code.
require_once('pi.class');
$_SESSION['pi'] = new pi();
The pages performing calculations can pass the pi object into calc via calc’s constructor.

require_once('calc.class');
require_once('pi.class');
$calc = new calc($_SESSION['pi']);
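What the two classes might contain is not shown in the article, so the following is a sketch under assumptions: the property name value and the method circle_area() are invented for illustration, and both classes are declared inline instead of being loaded from pi.class and calc.class.

```php
<?php
// Data class: holds only the value (hypothetical contents of pi.class).
class pi
{
    public $value = 3.1416;
}

// Code class: holds only the calculations (hypothetical contents of calc.class).
class calc
{
    private $pi;

    public function __construct(pi $pi)
    {
        $this->pi = $pi;
    }

    // Area of a circle using whatever accuracy the data object carries.
    public function circle_area($radius)
    {
        return $this->pi->value * $radius * $radius;
    }
}

// Plain array standing in for the session.
$_SESSION = array();
$_SESSION['pi'] = new pi();

$calc = new calc($_SESSION['pi']);
print($calc->circle_area(2) . "\n");
```

Only the small pi object travels in the session; swapping in a more accurate value means changing the data class alone.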
This approach fits a site with many discrete values shared over all pages. The value of this approach increases if each data item requires complex and unique code to access the data. On a commercial site, you might use this approach when you have to retrieve a currency conversion rate from one source, and a commodity price from another site. You might want to get the current rate when a person logs on, but then keep the rate constant during several pages of calculations, a task ideal for objects in sessions. Now that we’ve seen how to effectively store and use data objects in the session, let’s look at another example.

Abstraction or Distraction

ADOdb and other PHP software products aim to give you a level of abstraction between your code and the rest of the world. ADOdb’s aim is to free you from writing code specific to a database. Most abstractions struggle with configuration issues. In the database area, MySQL and PostgreSQL use standard dates, while Oracle uses something different. To deal with this, you are faced with a few options. You could write PHP code to format your dates for a specific database, use ADOdb’s special date functions, or use a configuration option to make Oracle use a modern date format. How can objects stored in the session help? The date format string in ADOdb is a prime target for a small, frequently used data object that you can set once and then load into a session. Your database name and other configuration details could also go in the same object.

The advantage increases when individual users have separate requirements. As soon as your web site goes international you will get visitors using country-specific formats for dates, times, currencies and postal codes. All of these items fit the object-in-session ideal of small size, frequent use, and specificity to each user. Some data items, however, are just distractions. A visitor’s country code will be useless if your site has no country-specific code.
In order to use these little data objects, look for items that are used after the user logs on, and items shared among software. If the database name is only ever used by ADOdb, let ADOdb handle the name. If and when you feed the name to a second software application, move the name out of both sets of code into your special session object.

When Size is Important

After reading this article, you are ready to attack your code and push all your objects into sessions, but remember the 4 or 8KB disk block size I mentioned earlier. Use that size as an arbitrary limit to curb your enthusiasm. I sometimes put a 10 MB string into a session to pass the string from one page to another, but I do not leave the 10 MB in the session, since that would slow down every page. Here are some guidelines you can start with to loosely govern the space consumed by objects in sessions so you can keep your sessions light. Place the following code in a test page.
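A test page for this can be as small as a single assignment such as $_SESSION['x'] = 'string';. The sketch below approximates what ends up in the session file without opening one. It assumes PHP's default session format of name|serialized-value, and session_entry() and test_class are hypothetical names used only for this illustration.

```php
<?php
// Hypothetical helper: builds one entry the way the default "php" session
// serialize handler is assumed to write it, as name|serialized-value.
function session_entry($name, $value)
{
    return $name . '|' . serialize($value);
}

// A one-property class to compare against a plain string.
class test_class
{
    public $x = 'data';
}

$plain  = session_entry('x', 'string');
$object = session_entry('test', new test_class());

// Print each entry with its length in bytes.
print($plain . ' (' . strlen($plain) . " bytes)\n");
print($object . ' (' . strlen($object) . " bytes)\n");
```

Comparing the two lengths shows how much of an entry is names and punctuation and how little is data.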
Now run the page and look at the session file. The code adds the following string to your session: 13 bytes for the serialized value, plus the variable name and a separator.

x|s:6:"string";

You can see that one byte is used for the variable name. Clearly you do not want to give your frequently session-stored variables long names. Place the same variable into a class and look at the result. Listing 11 shows a test class and the object creation. Below you can see the 46 byte string result from the sessions file. The class name and the variable name take up space, so keep both names short.

test|O:10:"test_class":1:{s:1:"x";s:4:"data";}

You can make your code as elaborate, or weird, as you like because the code does not take up space in the session. Listing 12 shows a class with an unnecessarily long bit of code in the middle. The class is followed by the object creation. The code produced the following string in the session file. Note that none of the literal values or work fields use space in the session.

t|O:1:"t":1:{s:1:"x";s:4:"aaaa";}

OK, You Can Add Code

The previous examples feature mainly data in the objects stored in the session. What can you do if you have really slow code access, do not have a code optimizer, or just want to show off to friends? You can place your code in the session! This is a class act that could save you access time if you have to retrieve code from a remote network. Put the following statement in a test page.

$_SESSION['c_class'] = "
class class_in_session
{
    function class_in_session()
    {
        print('class_in_session: it works!');
    }
}";
Now you have a class’s code stored in your session. If that class just came across a slow link from your server in Ivigtut, Greenland, you really want to save rereading the code for every page. While this example sounds stupid now, the move to web services means we are using code on a remote server that may in turn use a remote server serving from an even more remote server. You may never know how many servers are in the chain between you and the real data source. So, how do you use the class locally? Try the next line.

eval($_SESSION['c_class']);
eval() evaluates the string as PHP, exactly as if you had saved the string in a local file and then included that file via include(). Now you can place your object in the session using the following code.

$_SESSION['c_object'] = new class_in_session();
On subsequent pages you must create the class before you access the object, so use the following code before you access the object.

eval($_SESSION['c_class']);
Listing 11: Simple object serialization example
Note that this will only work when you control the unserialization. If you can access the $_SESSION['c_class'] variable, the session has already been started and the object has been unserialized. Unfortunately the object would not have been properly unserialized because its class was not yet declared (a quick print_r() on $_SESSION will confirm this). It’s a little like the chicken coming before the egg. If you control unserialization, however, you can declare the class with eval() and then unserialize the object.
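Putting the pieces together with manual serialization might look like the sketch below. It is an illustration, not the article's listing: the class body is simplified to a single property so it runs on current PHP versions as well, and $_SESSION is filled as a plain array instead of a live session.

```php
<?php
// Plain array standing in for the session.
$_SESSION = array();

// Class source held as a string (simplified from the article's example).
$_SESSION['c_class'] = <<<'CODE'
class class_in_session {
    public $greeting = 'class_in_session: it works!';
}
CODE;

// First page: declare the class from the string, then store the object
// as a serialized string so that unserialization stays under our control.
eval($_SESSION['c_class']);
$_SESSION['c_object'] = serialize(new class_in_session());

// Later page: declare the class first, then unserialize. The guard only
// matters here because both "pages" run inside one script.
if (!class_exists('class_in_session', false)) {
    eval($_SESSION['c_class']);
}
$object = unserialize($_SESSION['c_object']);
print($object->greeting . "\n");
```

Because eval() runs whatever the string contains, this only makes sense when the session contents come from code you trust.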
Listing 12: More complex object serialization example

One of the first questions you might ask is, How do sessions perform when filled with objects?

The File Open Blues

Storing class code in sessions reminds me of another way to save time. The overheads mentioned in this section are true of almost every operating system ever built. I’ll mention a script including 100 classes, but if that seems too many for you, prepare to be shocked by Small Incremental Class programming. The fully normalized code for a content management system would use more than 1,000 SIC classes, with most included on every page. You are about to suffer the File Open Blues.

File access takes O + (N * R) seconds where O is some tiny amount of time to open the file, N is the number of file blocks read, and R is the block read time. Reads from RAID are faster than normal disk, and reads from cache beat RAID. Whatever you use for your files, one thing is constant: O is greater than R. Opening a file costs more than reading a block because the operating system has to find the file, check access privileges and sometimes log the open.

File opens are the curse of object oriented programming. If you did not have file opens you could place every class in a unique file of the same name. Your objective_foot class would go in objective_foot.class. PHP could have a php.ini setting that says to include class foot from directory /classes/, another to tell PHP you are using the extension .class, and another to tell PHP that the new operator should automatically include the class. The code:

require_once('objective_foot.class');
$foot = new objective_foot();
would be reduced to:

$foot = new objective_foot();
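PHP versions released after this article offer spl_autoload_register(), which gives roughly this convenience from code instead of php.ini. The sketch below is an aside under assumptions: the temporary directory and the one-line class file are created only so the example is self-contained. Each class file is still opened on first use, so the file open cost discussed here does not go away.

```php
<?php
// Hypothetical class directory, created in the temp folder for this sketch.
$class_dir = sys_get_temp_dir() . '/classes_' . uniqid();
mkdir($class_dir);
file_put_contents(
    $class_dir . '/objective_foot.class',
    "<?php class objective_foot { public \$name = 'foot'; }\n"
);

// Map a class name to <directory>/<name>.class and include it on first use.
spl_autoload_register(function ($class) use ($class_dir) {
    $file = $class_dir . '/' . $class . '.class';
    if (is_file($file)) {
        require_once $file;
    }
});

// No explicit require_once() is needed before the new operator.
$foot = new objective_foot();
print(get_class($foot) . "\n");
```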
Let’s jump back to reality. If you take 20 classes, place them in separate files, then run a script requiring all 20 classes, your script will run like a dog with a broken leg. Try the same script including 100 classes, and no matter how small the classes, the script will run like a dog with 3 legs broken. Now copy all the classes into one big include file and test again. Pow! You are back to the speed of light. Each file open costs you several times more overhead than each read. That is why just about every language allowing modular code also provides a way to include collections. Like a collection, your session record is a great place to store your 100 tiny classes without the “file open” overhead. Your session will always be opened, so you are already paying that price. Besides, the cost of opening the session is minimal because sessions are usually set up with minimum access controls and are among
the first files to have their directory cached. You can place all your classes in the session without even the overhead of opening a class collection file. Some ISPs have fast session file access but pathetically slow access to libraries and databases. You can get your classes and objects to the front of the access queue by placing both in the session record. Of course, a better approach is to change your ISP. The best approach is to try this technique first, list your coding accomplishment on your resume, then change your ISP.

Conclusion

Session records have a long history dating back to database-driven online systems used in the late 70s. Back then, the equivalent to Object Oriented Programming was named modular programming. Programmers experimented with storing modules with data in session records. They proved that the technique was useful, but not practical with the programming languages then in use. PHP makes the technique so simple that even I can do it! Storing objects in sessions is a simple, proven technique for squeezing maximum performance from small amounts of frequently-read user-related information.
About The Author
Peter Moulding started building computer systems about the time when people realised mainframes could be used as online information systems instead of just number crunching. He helped large companies discover GML (The grandparent of HTML and XML), SQL, develop user interfaces, and service level agreements. Then he jumped ship to use Apache and PHP on a PC because "I like replacing million dollar mainframes with $10,000 PCs". Peter experiments with PHP and MySQL at petermoulding.com.
Writing an RSS Aggregator With PHP By Marco Tabini
Tired of scouring the web for your news? Want to keep abreast of your favorite blogs without using software that doesn’t work the way you want and costs you money? A few lines of PHP could be the solution you’ve been looking for.
Weblogs (or blogs, as they are commonly called) have rapidly become commonplace on the Internet. Even though they started as little more than online journals and diaries, they have rapidly become everyone’s news and editorial outlets. Many people have their own “favorite” blogs that they visit on a daily basis looking for interesting news bits, opinions, funny or insightful stories, and so on.

Possibly the most significant feature of blogs, however, is the concept of a “newsfeed”. A newsfeed is a data file that contains information about the articles published by a blog’s author, a sort of “ticker edition” of the blog’s contents that can be picked up remotely and provide an at-a-glance overview of what is available on it.

Newsfeeds are nothing new, really. News organizations have been using them in one form or another to deliver information to their clients for decades, first through telex and now through the Internet. Some organizations, like the infamous PointCast, even used newsfeeds to deliver information to the end users’ desktops. In all these instances, however, newsfeeds were delivered on a one-off basis, using proprietary formats that varied significantly from provider to provider. As such, their usefulness was somewhat limited by the lack of a common standard that would have made it possible for an end user to aggregate them in any meaningful way.

XML To The Rescue!

At the end of 1999, Netscape needed a format to distribute information about the “news channels” that their browser then supported. They came up with an XML system based on (and compatible with) the Resource Description Framework (RDF) created and maintained by the W3C, and called it RDF Site Summary (RSS). In 2000, weblog pioneer Userland Software picked up RSS and greatly enhanced its features, distancing it from RDF (although maintaining the XML angle) and including some of the functionality from Userland’s own scriptingNews format. Although Netscape eventually abandoned their RSS efforts, Userland’s version of RSS, now dubbed Rich Site Summary, has been used consistently as a means to syndicate a website’s contents.

REQUIREMENTS
PHP Version: 4.1 and Above
O/S: Any
Additional Software: XML Parsing Extension
My personal opinion of XML is that it’s the Pokémon of computer science (boy, this is sure to earn me some flames). I used to think that someone got out of bed one day and thought: ‘let’s take a simple concept, say, comma delimited files, and turn it into a monster so complex that we can build a whole industry around it’. For a while, XML was hailed as the Next Big Thing and you could find it everywhere, just like Nintendo’s three-frame-a-minute cartoons. When I first found out that RSS was based on XML, therefore, I cringed at what lay ahead.

Luckily, I was wrong. RSS is an excellent example of an application for which XML is a perfect choice, given the wide variety of different computer systems that need to use it. Its greatest advantage is its simplicity; the contents of an entire website can be easily digested into a single RSS file that contains but a few XML elements, and the code for generating an RSS file is very simple to write in pretty much any language.

Aggregating RSS

With a common format to syndicate news and blog contents, the concept of a news aggregator has come
into play. An RSS news aggregator is, simply put, an application capable of polling the information syndicated by an arbitrary number of RSS sources and consolidating it into a single news stream. Aggregators have the power to revolutionize the way we access and absorb news from the world that surrounds us. Instead of having to continuously scour the Web looking for news stories, one only needs to access his news aggregator and have all the information he needs at his fingertips. For me, adopting a news aggregator has meant spending less time finding news and more time reading it. Added to the often openly frank and unfiltered nature of blogs, aggregating news gives me the opportunity to both receive information from my favorite news sources on a daily basis and compare my opinions with those of other people who share my ideas—or whose opinions are antipodean to mine. Unfortunately, my experience with news aggregators has been less than a pleasant one, as I never seem to find one that satisfies my needs. As such, I’ve come up with a simple aggregator of my own, written in PHP, that can be easily modified to suit pretty much any requirement.
Figure 1: What's Inside an RSS Feed?

An RSS feed is essentially a simple XML file that contains the following elements:

• A Channel specification, which in turn can contain several sub-elements, such as:
   o A Description of the news source
   o A Title of the feed
   o A Link to the feed's homepage
• An arbitrary number of Items, one for each story that the feed carries. Although there is no preset limit to the number of Item elements in a feed, the specifications recommend that no more than fifteen be returned. Each item can contain:
   o A Title of the item
   o A Description
   o A Link to the item's referenced page
For example, the following is a simplified example of php|a's RSS feed:

<channel>
  <title>php|architect - The Magazine for PHP Professionals</title>
  <link>http://www.phparch.com/</link>
  <description>The Monthly PDF Magazine Dedicated to PHP</description>
  <item>
    <title>Writing Efficient PHP Code</title>
    <link>http://www.phparch.com/news.php?id=92</link>
    <description>IBM releases a new tutorial on writing better PHP</description>
  </item>
</channel>
The RSS Format

The specifications for the RSS format can be found at http://web.resource.org/rss/1.0/spec. An RSS news feed is, essentially, a simple XML file that, at a minimum, contains the elements shown in Figure 1. As you can see, the main container element is Channel (a vague reminder of RSS’ initial purpose), which contains information about the originator of the news. In addition, the feed should also contain an arbitrary number of Item elements, each of which provides information for a single news story. Although there isn’t a predefined limit to the number of items that can be included in a newsfeed, the specifications recommend that no more than fifteen be sent by an originator.

A news aggregator works by reading in the XML file from an arbitrary number of feeds (determined by the user), interpreting their contents and, finally, outputting them to the user in an aggregated format.

Getting Started

The news aggregator that I have written relies primarily on PHP’s ability to read remote files and parse XML contents. As a result, you will need to have both the fopen() wrappers and the XML parsing engine enabled in your installation of the PHP interpreter in order for the aggregator to work properly.

Reading remote files through HTTP from PHP is a relatively simple operation. All you really need to do is open a URL as if it were a file and read its contents. For example:
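A minimal sketch of such a read is shown here. The helper name read_location() is invented for this illustration, and the demonstration reads a local temporary file so that it runs without network access. With the URL wrappers enabled, the same call would accept a location such as http://www.phparch.com/.

```php
<?php
// Hypothetical helper: read everything from a location that fopen() can open.
function read_location($location)
{
    $handle = fopen($location, 'r');
    if ($handle === false) {
        return false;
    }

    // Read in chunks until the end of the stream is reached.
    $contents = '';
    while (!feof($handle)) {
        $contents .= fread($handle, 4096);
    }
    fclose($handle);

    return $contents;
}

// Local demonstration with a temporary file in place of a remote URL.
$demo = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($demo, '<html>demo page</html>');
print(read_location($demo) . "\n");
unlink($demo);
```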
These five lines of code will open a connection to the php|a website, read the contents of its main page and output it to the caller. If you’re using PHP 4.3 or higher, you can even specify HTTPS as your protocol of choice!

Parsing XML

Even with my compulsive allergy to XML, writing a news aggregator in PHP is a very pleasant experience. In fact, PHP features a complete XML-parsing system that can be used to effortlessly interpret the contents of even the most complicated XML data stream.

In PHP, XML information is handled by a parser, which reads the raw XML input and analyzes its structure, extracting the information it contains. The parser then calls several user-provided functions to deal with the contents of the file. Contrary to what many people seem to think, the goal of the parser is not to load an XML file into memory. The parser’s role is only to break down a correctly formatted XML stream into its components, thus simplifying life for the developer, who is free to concentrate on interpreting its contents.

A parser is instantiated using the xml_parser_create() function, which needs no parameters (an optional encoding can be supplied) and returns a resource that can later be used to reference the newly created parser. Once its operations are complete, a parser can be destroyed by passing its associated resource as an argument to xml_parser_free().

The XML parser engine supports many different options, which can be set using the xml_parser_set_option() function:

xml_parser_set_option (
    resource parser,
    int option,
    mixed value);
In our case, the only option of interest is XML_OPTION_CASE_FOLDING, which causes the parser to change the case of all the XML element and attribute names it encounters so that they are all in uppercase characters. This makes interpreting the contents of the file much easier by using a simple comparison.

As I mentioned earlier, the parser allows its caller to interpret the contents of an XML stream by using several callbacks. Considering the structure of an RSS file, we will be primarily interested in the XML elements themselves, so that we can determine when an item begins and ends, and the contents of the Description element. The xml_set_element_handler() function is used to specify the two callbacks that the parser should use when it encounters the beginning of an element and its end:

xml_set_element_handler (
    resource parser,
    string element_start_function,
    string element_end_function);
The element_start_function and element_end_function parameters contain the names of the functions that the parser should execute when a new element begins and ends respectively. For example:

xml_set_element_handler (
    $parser,
    "StartElement",
    "EndElement"
);
This would require that StartElement() and EndElement() had been previously declared as follows:

StartElement (
    resource parser,
    string name,
    array attributes);

EndElement (
    resource parser,
    string name);
Although I won’t use this feature in the code I’m presenting as part of this article, I should also point out that, instead of function names, you can also pass an associative array to xml_set_element_handler() as the value of either element_start_function or element_end_function. If you do so, whenever the parser encounters a particular element, it will call the function associated with it in the array. For example, if you pass this array as the value of element_start_function: (
In both cases, the [parser] and [name] parameters hold a reference to the calling parser and the name of the element that is starting or ending. As far as StartElement is concerned, the [attributes] parameter contains an associative array of any attributes that are part of the element declaration. For example, an element declared with an href attribute of http://www.phparch.com would cause the attributes parameter to contain the array ['href' => 'http://www.phparch.com'].
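The calls described so far can be combined into a small working sketch. It is an illustration, not the article's aggregator: the feed fragment is made up, the global variables are a shortcut for brevity, and xml_set_character_data_handler() is used to collect element text even though the text above does not cover it.

```php
<?php
$titles  = array();  // item titles collected by the handlers
$in_item = false;    // true while the parser is inside an <item>
$current = '';       // name of the element currently open

function StartElement($parser, $name, $attributes)
{
    global $titles, $in_item, $current;
    $current = $name;
    if ($name === 'ITEM') {
        $in_item = true;
    } elseif ($in_item && $name === 'TITLE') {
        $titles[] = '';  // start a new title; text may arrive in chunks
    }
}

function EndElement($parser, $name)
{
    global $in_item, $current;
    $current = '';
    if ($name === 'ITEM') {
        $in_item = false;
    }
}

function CharacterData($parser, $data)
{
    global $titles, $in_item, $current;
    if ($in_item && $current === 'TITLE') {
        $titles[count($titles) - 1] .= $data;
    }
}

// Made-up feed fragment for the demonstration.
$xml = '<channel><title>php|architect</title>'
     . '<item><title>Writing Efficient PHP Code</title>'
     . '<link>http://www.phparch.com/news.php?id=92</link></item></channel>';

$parser = xml_parser_create();
// With case folding on, names reach the handlers in uppercase.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($parser, 'StartElement', 'EndElement');
xml_set_character_data_handler($parser, 'CharacterData');

if (!xml_parse($parser, $xml, true)) {
    print(xml_error_string(xml_get_error_code($parser)) . "\n");
}
if (PHP_VERSION_ID < 80000) {
    xml_parser_free($parser);  // explicit freeing only matters before PHP 8
}

print_r($titles);
```

Because case folding is enabled, the handlers compare against ITEM and TITLE, not item and title.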
Exploring XSLT Processing Options Within PHP By Stuart Herbert
XSLT is a W3C standard for turning XML into HTML, or even another XML document. Thanks to Sam Ruby’s Java extension for PHP, and the work of JSR-063, taking full advantage of XSL is very easy.
Introductions

I became interested in using XSLT from within PHP when I was designing docXP, a Javadoc clone for PHP programmers. docXP generates an XML description of PHP source code, and then uses XSLT to transform that XML into HTML. I found that PHP’s XSLT support was not robust enough for my purposes. Rather than give up on my project, I decided to find a way to combine PHP with Java’s many excellent implementations of the XSLT standard. In this article, I’m going to share with you the approach I discovered for processing XSLT, by covering:

· what XSLT is, and why it’s sometimes useful
· the various ways of performing XSLT processing (Sablotron, calling Java apps from the command line), and the problems (robustness, performance)
· the Java extension for PHP
· how to configure the Java extension for Windows and Linux
· the TRaX interface defined in JSR-063
· implementing XSLT processing using that interface
· where to get good XSLT processors from

I hope you’ll find this information helpful.

REQUIREMENTS
PHP Version: Java 1.4.x, PHP 4.3.0 with Java
O/S: Any
Additional Software: XSLT and Sablotron support enabled

Before We Start

To make full use of the examples listed in this article, you will need a copy of PHP 4.3.0, with the XSLT and Java extensions compiled in, and with Sablotron support enabled. Any version of PHP 4.1.x or greater should work, but I have not tested my code with them. For Windows users, the pre-compiled PHP 4.3.0 binaries for Windows include both extensions, and the Sablotron library. Make sure that you uncomment the following lines in your php.ini file:

Listing 1

;;;;;;;;;;;;;;;;;;;;;;
; Dynamic Extensions ;
;;;;;;;;;;;;;;;;;;;;;;
;
; If you wish to have an extension loaded
; automatically, use the following syntax:
;
; extension=modulename.extension
;
; For example, on Windows:
;
; extension=msql.dll
;
; ... or under UNIX:
;
; extension=msql.so
;
; Note that it should be the name of the
; module only; no directory information
; needs to go here. Specify the location
; of the extension with the extension_dir
; directive above.

;Windows Extensions
;Note that MySQL and ODBC support is now
;built in, so no dll is needed for it.
;
extension=php_java.dll
extension=php_xslt.dll
During the article, I’ll show you how to configure the Java extension to work with your copy of Java. I recommend downloading the J2SDK-1.4.1-01 from Sun’s website http://java.sun.com/. I’ll introduce you to other useful downloads when they’re discussed in the article.

At times in the article, I refer to classes being singletons and facades. If you’re not familiar with object-oriented jargon, you probably won’t know that these are examples of design patterns. I’ll explain what each pattern is when we come across it. Addison-Wesley published the most famous book on the subject, called Design Patterns. If you’re interested in learning more, the Addison-Wesley book is a good introduction to the subject.

What Is XSLT?

XSL is the XML stylesheet language from the W3C. It consists of three parts: XSL Transformations (XSLT), documented at http://www.w3.org/TR/xslt; the XML Path Language (XPath), documented at http://www.w3c.org/TR/xpath; and XSL Formatting Objects (XSL-FO), documented at http://www.w3c.org/TR/xsl.

XSLT allows us to transform one XML document into one (or more!) XML documents. It is often used to transform XML into HTML on websites with dynamic content. The advantage is that the look and feel of the website can be changed over time without having to change the code that generates the dynamic content. One example website that uses this technique is the Gentoo Linux website http://www.gentoo.org/. Here, they use XML to publish the same tutorial content to the web as HTML, and to other output formats such as DocBook (see Gentoo Linux).

I’ve created PHP classes that generate an XML description of PHP source code. Listing 2 contains an actual XML document created by my classes. I’m interested in using XSLT to turn the XML into HTML. Listing 3 contains the XSLT file I wish to use on the XML document in Listing 2. You can find these listings in the listings folder that comes with your copy of php|architect.

XSLT Processing From PHP

The XSLT Extension

I started out by looking at what built-in support PHP has for XSLT processing. PHP 4.3.0 comes with built-in support for processing XSLT templates, thanks to the XSLT Extension and the Sablotron library. The XSLT Extension, added in PHP 4.1.0, provides a single API for accessing all of the XSLT processing engines that PHP supports. Listing 4, taken from my docXP source code, shows how to use the API. The basic flow is very straightforward:

1. Call xslt_create() to obtain a handle to an XSLT processor. The handle must be passed into all the other xslt_ functions.

$h = xslt_create();
2. Call xslt_process() to perform the XSLT processing. If your XML and XSLT files are on disk, and you wish the output to go to disk, then you can do:

xslt_process($h, "myxmlfile.xml", "myxslfile.xsl", "myoutputfile.html");
At the other extreme, if $xml contains your XML document, $xsl contains your XSLT document, and you want the output to go to $output, you would do it this way:
3. Call xslt_free() to release the handle to the XSLT processor.

xslt_free($h);
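For the in-memory case in step 2, the three steps can be sketched as below. This assumes the argument-buffer convention of the Sablotron-backed extension, in which named buffers are passed in an array and referenced with the arg: scheme. The function name transform_in_memory() and the sample documents are invented for illustration, and the function returns null when the xslt_ functions are not loaded.

```php
<?php
// Sketch only: returns the transformed text, or null when the xslt_*
// functions are not available in this PHP build.
function transform_in_memory($xml, $xsl)
{
    if (!function_exists('xslt_create')) {
        return null;
    }

    // Step 1: obtain a processor handle.
    $h = xslt_create();

    // Step 2: named buffers stand in for the file names.
    $arguments = array('/_xml' => $xml, '/_xsl' => $xsl);
    $output = xslt_process($h, 'arg:/_xml', 'arg:/_xsl', null, $arguments);

    // Step 3: release the handle.
    xslt_free($h);

    return $output;
}

// Made-up sample documents.
$xml = '<?xml version="1.0"?><greeting>Hello</greeting>';
$xsl = '<?xml version="1.0"?>'
     . '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
     . '<xsl:template match="/"><p><xsl:value-of select="greeting"/></p></xsl:template>'
     . '</xsl:stylesheet>';

$output = transform_in_memory($xml, $xsl);
print($output === null ? "XSLT extension not available\n" : $output . "\n");
```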
You can use xslt_process() to process XML and XSLT held in files on disk, strings in memory, or some combination of the two.

PHP and Sablotron

PHP 4.3.0 only supports one XSLT processing engine: Sablotron. Created by The Ginger Alliance, Sablotron is written in C, and is freely available for both Windows and UNIX operating systems. If you are using PHP on Windows, make sure that you download the latest version of Sablotron directly from the Ginger Alliance’s website. The pre-compiled PHP binaries do not always ship with an up to date version of Sablotron.

I’ve found Sablotron to be very fast. Unfortunately, I’ve also found PHP to fatally crash from time to time when using Sablotron on Windows XP. (See http://bugs.php.net/22147) Now, fatal crashes are a big deal for me, as an engineer who makes a living using PHP. And here’s why. Diagram 1 contains captured dialog boxes from one such crash of PHP 4.3.0 for Windows, as downloaded from http://www.php.net/. Take a moment to really pay attention to the text in the dialog boxes. There are two key phrases I want you to think about.

1. If you were in the middle of something, the information you were working on might be lost.
2. We are sorry for the inconvenience.

Your customers will not appreciate fatal errors in the middle of something, and may find the inconvenience bad enough to vote with their mouse and take their business elsewhere. Do forgive me for this, but I’d like to see PHP do better on this. The XSLT standard was first published in draft form on 18th August 1998, and finally approved on 16th November 1999. PHP 4.3.0 was released three years later, on 27th December 2002. Isn’t it time that PHP could be relied upon to support this standard? At least without crashing? I think so. Heck, I’d settle for some XSLT transforms not working quite right, so long as PHP didn’t crash any more. Bugs like that can be worked around. All one can do for now is hope that it’s being worked on as we speak.
March 2003 · PHP Architect · www.phparch.com
Exploring XSLT Processing Options Within PHP

Until then, I still need a way of performing XSLT processing using PHP. If PHP’s built-in functionality can’t be relied upon to do the job (yet), I need to use external XSLT processing engines instead.

Calling Other XSLT Processor Engines
The next thing I did was to download a stand-alone XSLT processor that I could run manually on the command-line. I chose Instant Saxon, available from http://saxon.sourceforge.net/, mainly because it’s written by Michael Kay. I use one of Michael’s books – XSLT Programmer’s Reference, from http://www.wrox.com/books/1861003129.htm – as my desktop bible on the XSLT standard. Listing 5, taken from my docXP source code, shows how to call Saxon in this way.

This approach has one very obvious problem. Having PHP execute a stand-alone XSLT processor is many times slower than using Sablotron. There’s the overhead of the temporary files on disk (or tmpfs for canny Linux users), and there’s the fork()/exec() overhead of executing the XSLT processor as a separate process. I wouldn’t recommend it to anyone, unless performance really did not matter at all. But it proves the point – that we can go beyond what PHP offers if we really need to.

Instant Saxon is part of a larger project – Michael Kay’s Saxon XSLT processor for Java. Saxon implements a set of classes that other Java programs can call on. It also has the advantage that it’s been around for years, and is as standards-compliant and robust as they come. If we could call those classes from PHP, that would be great. That would give us XSLT processing using one of the most respected XSLT processing engines around. If we could call those classes without the performance problems of using a command-line XSLT processor, that would be better still. We’d be able to do all the XSLT processing we want, without having to use either the XSLT Extension, or the Sablotron library.

Well, I found out that we can do all this.
And we have Sam Ruby to thank for it.

Introducing The Java Extension
Sam Ruby contributed a very useful Extension to PHP 4.0. He wrote a very simple Extension that would allow PHP programs to create and use Java objects as if they were PHP objects. All we have to do is create a new Java() object, and the rest is done by smoke and mirrors. It’s simplicity at its best. See Listing 6 for an example of the syntax.

Curious as to how it works? Here’s an overview of what happens:
· When we use the Java() keyword for the first time, the Java Extension starts up a Java Virtual Machine (or JVM) in the background, using a programming interface called Java Native
Interface (or JNI). The JVM runs inside our PHP process. Once loaded, the same virtual machine is re-used time and time again until our PHP process ends. We can compile in the Java Extension – or load it as a shared object – and if we don’t use the Java() keyword at all, the Extension does nothing.
· The Java Extension then uses Java’s reflection capabilities to examine the Java object we’re trying to create. A PHP object (of type java) is created, and given the same functions as the Java object.
· We just call the PHP object, and the Java Extension automatically passes everything through to the JVM and back again.

It’s worth noting here that all parameters to functions are passed by value. We’ll return to this point later in the article.

So far, so good. This is great for people like me, who want to do XSLT processing from inside a PHP command-line script. It’s also great for people who want to use XSLT as part of their website – provided you have compiled PHP as a module for your web server. The overhead of starting up the JVM occurs just once per web server process – the very first time the Java() keyword is used. And, because web servers such as Apache try to re-use their processes as much as possible, this means that the Java Extension very rarely starts up or shuts down.

If you’re using PHP as a CGI program, then you’ll probably have the overhead of starting the JVM every time your CGI script executes. You may find that the overhead is just too much for a busy web site. But why not give it a try, and find out for yourself?
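To make the reflection step a little more concrete, here is a minimal Java sketch of the general idea: look up a public method by name at run time and forward a call to it. It is not the Java Extension’s actual code; the class name ReflectionBridgeSketch and the helper call() are hypothetical and exist only for this illustration. It uses the same java.util.Stack class that Listing 6 exercises.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.EmptyStackException;

public class ReflectionBridgeSketch {

    // Hypothetical helper: find a public method by name and argument count,
    // then invoke it, much as a bridge would forward a call it only learns
    // about at run time.
    public static Object call(Object target, String name, Object... args) throws Exception {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // Unwrap, so the caller sees the exception the Java method threw.
                    Throwable cause = e.getCause();
                    if (cause instanceof Exception) {
                        throw (Exception) cause;
                    }
                    throw e;
                }
            }
        }
        throw new NoSuchMethodException(name);
    }

    public static void main(String[] args) throws Exception {
        // Create the object from its class name alone, as a bridge has to.
        Object stack = Class.forName("java.util.Stack").getDeclaredConstructor().newInstance();

        call(stack, "push", 1);
        System.out.println(call(stack, "pop"));   // the value pushed above

        try {
            call(stack, "pop");                    // the stack is empty now
        } catch (EmptyStackException e) {
            System.out.println(e);                 // the exception's class name
        }
    }
}
```

The sketch matches methods on name and argument count only, which is enough for java.util.Stack; a real bridge would also have to convert argument types and choose between overloaded methods.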
A quick read of the Java Extension’s documentation in the PHP manual is enough to put most people off taking the Extension out for a spin. This part of the otherwise excellent PHP Manual really stands out for being difficult to understand, and for a long list of comments by PHP users who just cannot get the Extension to work.

I think that this is a shame. I found it straightforward to configure the Java Extension (after reading the PHP source code!), and Sam did a nice job of making the Extension as easy to use as possible. I hope this article goes a long way to helping other PHP users take full advantage of this powerful and practical Extension – and to make sure of it, I’ll show you how to configure the Java Extension for Windows, and for Linux.

Configuring the Java Extension for Windows
To use the Java Extension on Windows, you have to correctly set up a number of configuration options in your php.ini file. Ignore what the PHP manual says. You only have to set up two options.

php.ini file setting: Purpose
java.library: Set this to point to the DLL of the Java Virtual Machine you wish to use.
java.class.path: Set this to be the CLASSPATH you want the Java Virtual Machine to use. The CLASSPATH must include the php_java.jar file, or the Extension will not work.
Listing 7 contains working settings from my own php.ini file on Windows. Let’s go through each of these in more detail.

Listing 7: php.ini settings
; PHP is installed into c:\php
; JDK 1.4.1_01 is installed into c:\j2sdk1.4.1_01
;
; These settings in the php.ini file work for me
[Java]
; this is the virtual machine to use
java.library = c:\j2sdk1.4.1_01\jre\bin\client\jvm.dll

1. java.library
The Java Extension works by running a Java Virtual Machine (or JVM) inside your copy of PHP. To do this, the Extension needs to know which virtual machine to use. In JDK 1.4.1-01, which is what I’m using to write this article, the DLL containing the JVM is in the jre/bin/client/jvm.dll file on disk. We have to set the java.library option to point to the JVM we wish to use.

2. java.class.path
The Java Virtual Machine needs some help in talking to PHP. The Java Extension, written in C, does a lot of the work. The rest of the work has been written in Java, and is compiled into the file php_java.jar. By default, this is in the extensions directory. We have to add the full path to the php_java.jar file to the java.class.path option in the php.ini file.
Remember also to add any other JAR files that you need!

Although the PHP Manual does mention other php.ini options, they are not required on Windows to use PHP 4.3.0 with JDK-1.4.1-01. From reading the PHP source code, it appears that some options are required for supporting older Java releases, and others are there to give the JVM a hint in unusual circumstances.

Configuring the Java Extension for Linux
I use Gentoo Linux myself. Out of the box, Gentoo Linux does not (at the time of writing) compile or install the Java Extension. Gentoo Linux uses scripts called ebuilds to install packages such as PHP. I’ve submitted updated PHP CLI and Apache module ebuilds to fix this, which are attached to http://bugs.gentoo.org/show_bug.cgi?id=15574 and http://bugs.gentoo.org/show_bug.cgi?id=15650. These new ebuilds also automatically configure your php.ini file for you. If you use either of these ebuilds, you can skip the rest of this section.

If you use another Linux distribution, check with your Linux vendor to see whether they provide pre-compiled PHP 4.3.0 with the Java Extension enabled.

Before you can configure your copy of PHP, it is important to have a quick look at two things first:

1. Find the PHP extension file java.so (normally found somewhere under /usr/lib/php/extensions). Create a symbolic link to this file, and call the symlink libphp_java.so.
> cd /usr/lib/php/extensions/??
> ln -s java.so libphp_java.so
Under Linux, the Java Extension does not work unless the extension file is called libphp_java.so. Without it, the Java Virtual Machine itself throws a java.lang.UnsatisfiedLinkError exception.

2. Find the file php_java.jar, which should have been installed with PHP. Make sure you have one, and that you know where it is. If you do not have a copy, it is created in ext/java when you compile PHP 4.3.0 using the --with-java option.

To use the Java Extension on Linux, you have to correctly set up a number of configuration options in your php.ini file. Ignore what the PHP manual says. You have to set up three Java options, plus the two standard settings that load the Extension.

php.ini file setting: Purpose
extension_dir: Set this to point to the directory containing the libphp_java.so PHP extension.
extension: Add the following line to load the Java Extension: extension=java.so
java.library: Set this to point to the shared library containing the Java Virtual Machine you wish to use.
java.library.path: Set this to point to the directory containing the compiled libphp_java.so PHP extension.
java.class.path: Set this to be the CLASSPATH you want the Java Virtual Machine to use. The CLASSPATH must include the php_java.jar file, or the Extension will not work.
Listing 8 contains working settings from my own php.ini file on Gentoo Linux. Let’s go through each of these in more detail.

1. extension_dir
The Java Extension can only be compiled as a dynamically loadable PHP extension. PHP needs to know which directory to look into to find such extensions.

2. extension
PHP needs to know which extensions to dynamically load. You can list as many extensions as you need. We need to tell PHP to load the Java Extension.

3. java.library
The Java Extension works by running a Java Virtual Machine (or JVM) inside your copy of PHP. To do this, the Extension needs to know which virtual machine to use. In Sun’s JDK 1.4.1-01, which is what I’m using to write this article, if your java executable is /opt/j2sdk-1.4.1-01/bin/java, the shared
library containing the JVM is in the /opt/j2sdk-1.4.1-01/jre/lib/i386/libjava.so file on disk. We have to set the java.library option to point to the JVM we wish to use.

4. java.library.path
Under Linux, the Java Virtual Machine

5. java.class.path
The Java Virtual Machine needs some help in talking to PHP. The Java Extension, written in C, does a lot of the work. The rest of the work has been written in Java, and is compiled into the file php_java.jar. By default, this is in the extensions directory. We have to add the full path to the php_java.jar file to the java.class.path option in the php.ini file. Remember also to add any other JAR files that you need!
Listing 6
<?php
$stack = new Java("java.util.Stack");
$stack->push(1);

#
# Should succeed and print out "1"
#
$result = $stack->pop();
$ex = java_last_exception_get();
if (!$ex) print "$result\n";

#
# Should fail - note the "@" eliminates
# the warning
#
$result = @$stack->pop();
$ex = java_last_exception_get();
if ($ex) print $ex->toString();

#
# Reset last exception
#
java_last_exception_clear();
?>
Although the PHP Manual does mention other php.ini options, they are not required on Linux to use PHP 4.3.0 with JDK-1.4.1-01. From reading the PHP source code, it appears that some options are required for supporting older Java releases, and others are there to give the JVM a hint in unusual circumstances.

Testing The Java Extension
We’ve completed all the configuration steps – now for the acid test. Does it work? Listing 6 comes with the source code of PHP, and can be found as the ext/java/except.php file. If you run this, you should get the output shown in Listing 9.

Listing 9
1
java.util.EmptyStackException

If you get this error:
Fatal error: Unable to create Java Virtual Machine in on line <X>
this means that your java.library setting in php.ini does not point to a Java Virtual Machine. The Java Virtual Machine is a DLL on Windows, and a .so file on Linux.

If you get this error:
Fatal error: java.lang.NoClassDefFoundError: net/php/reflect in on line <X>
this means that your java.class.path setting in php.ini does not include the php_java.jar file. The php_java.jar file lives in the extensions sub-directory of your PHP installation. All being well, you’ve now got everything working,
Listing 8: php.ini setting on Gentoo Linux
;
; php.ini settings for Gentoo Linux & Java Extension
; taken from /etc/php4/php.ini
; change the directories to suit your Linux flavour
; Stuart Herbert, 2003/02/12
;

; the Java Extension is available only as a dynamically-loaded
; extension
extension_dir = /etc/php4/lib
extension = java.so

; settings for loading the Java Virtual Machine
[java]
java.class.path = /etc/php4/lib/php_java.jar
java.library = /opt/sun-jdk-1.4.1.01/jre/lib/i386/libjava.so
java.library.path = /etc/php4/lib
and we can start figuring out how to call our favourite XSLT processor from within PHP.

Using The Java Extension In The Rest Of The Article
To use the code listed in the rest of the article, you’ll need to download and install an XSLT processor written in Java. For this article, I am using Saxon v6.5.2, available from http://saxon.sourceforge.net/. You need to add the saxon.jar and saxon-jdom.jar JAR files to the java.class.path option in your php.ini file. Listing 8 shows what that looks like on my machine.

At this point, we could just dive in, and write a piece of PHP code to drive our favourite XSLT processor. In fact, this is what I did at first. But what happens when we want to change XSLT processors? The PHP code that we would have written would no longer be valid, and would need changing. It would be much better if we could write a PHP <=> XSLT interface just the once, and then re-use it with any XSLT processor. Well, we can, if we follow the work of JSR-063.

Using The Java Extension To Perform XSL Transformations

JSR-063 – A Standard Way To Perform Transformations In Java
Java Specification Requests (JSRs) are the way in which ordinary developers in the Java community can request additions to the Java language and/or libraries. The process is a mixture of public participation and expert working groups. You can find out more about the process at http://jcp.org/introduction/owverview. There’s no direct equivalent in the PHP world (more’s the pity), although I suppose that PEAR comes close in some ways.

JSR-063 is a request for a standard API for performing XML processing. Java programmers will already be familiar with the results – the JAXP API for XML parsers. JAXP provides a standard API for accessing, and using, XML parsers from different vendors. This allows Java developers to change their XML parser without having to change their code.
The working group has also defined a standard API for accessing, and using, XML transformation (little t) engines – the TRaX API. (I don’t know what TRaX stands for – it’s not explained in the JSR-063 document.) TRaX is designed to know nothing about the actual transformation that takes place. It is not limited to XSL Transformations. Theoretically, there’s no reason why it couldn’t be used to access a translator from English to another language, for example, if such a beast existed. I decided to use the TRaX API to perform XSLT processing from within PHP.
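As a small illustration of that vendor independence, the Java sketch below asks TRaX for a factory and prints the name of the implementation class it was given. The class name TraxEngineSketch and the method engineName() are hypothetical names used only here. The sketch assumes a JVM that can find at least one TRaX implementation, either bundled with the JDK or supplied on the CLASSPATH by a processor such as Saxon.

```java
import javax.xml.transform.TransformerFactory;

public class TraxEngineSketch {

    // Returns the class name of whichever TRaX implementation the JVM
    // located. The calling code never names a vendor class itself.
    public static String engineName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(engineName());
    }
}
```

The JAXP lookup consults, among other things, the javax.xml.transform.TransformerFactory system property, so setting that property on the JVM is one way to switch engines without touching the calling code. The value to use is the factory class documented by your XSLT processor.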
Introducing The TRaX API
Recommended reading: The latest documentation for the TRaX API can be found at http://jcp.org/aboutJava/communityprocess/review/jsr063/. Make sure you get the latest PDF document. The TRaX API documented in the original v1.1 standard is not supported by XSLT processors such as Saxon. I spent several frustrating hours finding this out the hard way!
Unlike PHP, Java programs consist entirely of classes arranged in a strict hierarchical order called packages. Each class must belong to a package. Classes in the TRaX API can be found in the javax.xml.transform package:

Package: Contents
javax.xml.transform: The classes required for performing XSL transformations
javax.xml.transform.dom: Classes for interfacing with a DOM XML parser
javax.xml.transform.sax: Classes for interfacing with a SAX XML parser
javax.xml.transform.stream: Input and output classes
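As a rough guide to how these packages fit together, here is a minimal Java sketch of a transformation using classes from javax.xml.transform and javax.xml.transform.stream. It is not the code from the article’s listings; the class name TraxTransformSketch and its transform() method are hypothetical. It assumes a TRaX implementation is available to the JVM, and it works on strings in memory to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TraxTransformSketch {

    // Applies the stylesheet text in xsl to the document text in xml and
    // returns the result as a string.
    public static String transform(String xml, String xsl) throws TransformerException {
        // The factory is obtained through a static method, not a constructor.
        TransformerFactory factory = TransformerFactory.newInstance();

        // The stylesheet is handed over as a stream source.
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));

        // Input and output are wrapped in the stream classes as well.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<greeting>Hello, TRaX</greeting>";
        String xsl =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'><xsl:value-of select='greeting'/></xsl:template>"
          + "</xsl:stylesheet>";

        System.out.println(transform(xml, xsl));
    }
}
```

Wrapping the calls in a single static method that takes and returns strings keeps the surface small, which matters when every argument crossing into the JVM is passed by value.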
These classes are only an API. J2SE 1.4 bundles them along with a default XSLT engine, but each XSLT processor, Saxon included, has to provide its own implementation of them. You can see an example implementation of the TRaX API in the source code to Saxon.

Performing XSL Transforms using the TRaX API
For our simple needs, we do not need to know everything about all of the classes inside the javax.xml.transform package. We are primarily concerned with two classes:
1. javax.xml.transform.TransformerFactory is a class that generates objects (called Transformers) that perform XSLT processing.
2. javax.xml.transform.Transformer is the class that (as far as we are concerned) performs the XSLT processing.

It couldn’t be simpler. We use TransformerFactory to get ourselves a Transformer, and then we use the Transformer to do our XSLT processing. Well, this is
Java, and Java wouldn’t be Java if things weren’t just a little more complicated than they first appear.

The TransformerFactory is a great fan of Christopher Lambert films – there can be only one! It is a singleton class. Only one TransformerFactory can exist inside the JVM at any one time. This is achieved by two things:

1. The constructor for TransformerFactory is package protected. Only code belonging to javax.xml.transform can create a TransformerFactory object. It goes without saying that none of our code belongs to the javax.xml.transform package. Run Listing 10 to see what happens when you try to create a TransformerFactory object from PHP. The error message you get may vary, depending on your version of PHP and your version of Java.

2. Classes that do not belong to javax.xml.transform can obtain a TransformerFactory object by calling the static function TransformerFactory::newInstance(). It also goes without saying that, from PHP, we cannot call static methods directly. We can only call static methods on a successfully created Java() object. And we’ve already proven that we cannot successfully create a TransformerFactory object.

We’re left with just one way forward. We need to create the TransformerFactory object from inside a Java class. So that’s what I did next. I wrote my own class in Java to act as a façade to the TRaX API.

A Façade For Accessing TRaX From PHP
A façade is a false-front; it pretends to be one thing, but underneath it is something else. We need a Java class that pretends that using TRaX from within PHP is simple, but underneath has all the gory detail that we can’t deal with directly.

Take a look at Listing 11. It contains the source code for a Java class that is very usable from within PHP. Before we look at how the TRaX API itself works, let’s look at what my jsr63.java class offers.

1. You can run it from the command line (assuming that the class is in your CLASS