Free, tested & ready to use examples : PHP XML XSLT DomDocument importStyleSheet transformToXml
AnyExample.com
 

Making XML/XSLT driven site using PHP

abstract 
XML and XSLT provide a standard way to separate presentation from data. This article contains an example of a simple PHP "XSLT engine" for XML-driven web sites, which implements caching and Apache-based XML file processing.
compatible 
  • PHP 5 with XML/XSL extension
  • Apache HTTP Server 1.3 or higher

First, we have to set up our engine to process every *.xml file on the web server. We use the following Apache directives:

config file: .htaccess, httpd.conf
# Apache does not allow a comment on the same line as a directive,
# so every comment here sits on its own line.

# add handler 'ae_xslt' associated with xml files
AddHandler ae_xslt xml
# bind our script '/ae-xslt.php' to handler 'ae_xslt',
# so every *.xml request passes through '/ae-xslt.php'
Action ae_xslt /ae-xslt.php

# adding index.xml to the directory index list means that
# yoursite.com -> yoursite.com/index.xml
# yoursite.com/folder/ -> yoursite.com/folder/index.xml
DirectoryIndex index.xml index.php index.html

In case you're curious, here are the descriptions of these directives from the Apache documentation: AddHandler, Action

You may put these lines in the .htaccess file in the root folder of your site (your hosting provider must allow the 'FileInfo' override in .htaccess files) or directly in Apache's httpd.conf.

Our script receives information about the requested XML file in two environment variables: PATH_INFO (the URL path of the file, like /page1.xml) and PATH_TRANSLATED (the filesystem path of the file, like /var/www/htdocs/page1.xml).

Since the web server does not check that the handled file exists, the engine has to check this itself and output a 404 error message if the requested XML file does not exist.

Next, the engine checks whether there is a fresh cached version of the requested file by comparing file modification times. If the cached version is valid, the engine outputs it and exits.

Otherwise, the engine loads the main XSLT file 'ae-site.xslt', loads the requested XML file, performs the transformation, and saves a new cached file.
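The freshness rule described above can be tried in isolation before reading the full engine. The following is only a sketch: the function name ae_cache_is_fresh and the temporary files are assumptions made for this illustration and are not part of the engine. It treats the cache as fresh only when it is strictly newer than every source file.

```php
<?php
// Hypothetical helper (not part of the engine): returns true when
// $cached_file exists and is newer than every file listed in $sources.
function ae_cache_is_fresh($cached_file, array $sources)
{
    if (!is_file($cached_file))
        return false;               // no cache yet

    $cache_time = filemtime($cached_file);

    foreach ($sources as $source)
    {
        // a missing source, or one modified at or after the moment the
        // cache was written, makes the cache stale
        if (!is_file($source) || filemtime($source) >= $cache_time)
            return false;
    }
    return true;
}

// Demo with temporary files and explicit modification times
$dir   = sys_get_temp_dir();
$xml   = tempnam($dir, 'xml');
$xslt  = tempnam($dir, 'xsl');
$cache = tempnam($dir, 'htm');

$now = time();
touch($xml,   $now - 300);   // page edited five minutes ago
touch($xslt,  $now - 200);   // template edited a bit later
touch($cache, $now - 100);   // cache written after both

$fresh_before = ae_cache_is_fresh($cache, array($xml, $xslt));

touch($xml, $now);           // page edited again: cache is now stale
clearstatcache();            // PHP caches the results of filemtime()
$fresh_after = ae_cache_is_fresh($cache, array($xml, $xslt));

var_dump($fresh_before, $fresh_after); // bool(true), then bool(false)

unlink($xml);
unlink($xslt);
unlink($cache);
```

Passing the sources as an array keeps the check the same if more files, for example an included stylesheet, later need to invalidate the cache.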

Here is the source code:

source code: php
<?php
// AnyExample XSLT Site engine

// Allow PHP to report everything
error_reporting(E_ALL);

if (!isset($_SERVER['DOCUMENT_ROOT']))
    die("Web server didn't set DOCUMENT_ROOT");

// DOCUMENT_ROOT -- is a path to your
// web site's directory with your files.
$docroot = $_SERVER['DOCUMENT_ROOT'];

// some web servers pass file information
// in PATH_TRANSLATED/PATH_INFO,
// others (CGI and ISAPI setups) in
// ORIG_PATH_TRANSLATED/ORIG_PATH_INFO
// let's check:
$sapi = php_sapi_name();

// extra parentheses are required: && binds tighter than ||
if (((strpos($sapi, 'cgi') !== false) || ($sapi == 'isapi'))
    && isset($_SERVER['ORIG_PATH_TRANSLATED']))
{
    $real_file = $_SERVER['ORIG_PATH_TRANSLATED'];
    $http_file = $_SERVER['ORIG_PATH_INFO'];
}
else
{
    $real_file = $_SERVER['PATH_TRANSLATED'];
    $http_file = $_SERVER['PATH_INFO'];
}


// checking if source XML file exists
if (!file_exists($real_file))
{
// File does not exist: output 404 error
// The 'HTTP/' form of the status header is understood by both
// the Apache module and the CGI versions of PHP
header('HTTP/1.0 404 Not Found'); // 404 HTTP response status
// 404 page below. You may change its HTML code.
// Request data is escaped before it is printed.
?>
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD><BODY>
<H1>Not Found</H1>
The requested URL (<?php echo htmlspecialchars($http_file); ?>) was not found on this server.<P>
<HR>
<ADDRESS>Ae-XSLT by <a href="http://www.anyexample.com/">AnyExample</a>
at <?php echo htmlspecialchars($_SERVER['HTTP_HOST']); ?></ADDRESS>
<!--
<?php
echo str_repeat('ie padding', 40); // extra output for Internet Explorer
?>
-->
</BODY></HTML>
<?php
exit();
}

$cached_file = $docroot.'/.cache/'.str_replace('/', '-', $http_file);
// cached_file -- file that stores generated HTML code

$xslt_file = $docroot.'/ae-site.xslt';
// XSLT file -- file that contains the XSLT templates

$xml_time = filemtime($real_file);
$xslt_time = filemtime($xslt_file);
$cache_time = @filemtime($cached_file);
// Modification times of source XML file,
// XSLT file and cached file


// Compare file modification times.
// If the cache was created after the last modification of
// both the XML and the XSLT file
if (($cache_time > $xml_time) && ($cache_time > $xslt_time))
{
    // then we can output the cached file and stop
    readfile($cached_file);
    echo '<!--cached-->';
    exit();
}

// Loading XML file
$source_xml = file_get_contents($real_file);

if (strpos($http_file, '/sitemap.xml') !== false)
{
    echo $source_xml; // Do not process Google's Sitemap file:
    exit();           // output it as is and stop
}

// do not process empty files
if ($source_xml == "")
    die('Empty XML file');

// creating & loading DOMDocument
$xml = new DOMDocument;
$xml->substituteEntities = true;
// loadXML will fail if the document is not well-formed XML
// (some tags were not closed, etc.)
if ($xml->loadXML($source_xml) == false)
    die('Failed to load source XML: '.$http_file);

// Loading XSLT file
$stylesheet = new DOMDocument;
$stylesheet->substituteEntities = true;
if ($stylesheet->load($xslt_file) == false)
    die('Failed to load XSLT file');


// XSLT transformation
$xsl = new XSLTProcessor();
$xsl->importStyleSheet($stylesheet);
$output = $xsl->transformToXML($xml); // transforming


// in some versions of PHP the internal
// XSLTProcessor and DOMDocument
// generate broken XHTML code,
// so let's do our own 'htmlizing'

// htmlizing XML
$output = ltrim(substr($output, strpos($output, '?'.'>')+2)); // removing <?xml declaration
$output = preg_replace("!<(div|iframe|script|textarea)([^>]*?)/>!s", "<$1$2></$1>", $output);
// some browsers do not support self-closed div, iframe, script and textarea tags
$output = preg_replace("!<(meta)([^>]*?)/>!s", "<$1$2 />", $output);
// meta tag should have an extra space before />
$output = preg_replace("!&#(9|10|13);!s", '', $output);
// nobody needs character references for chars 9, 10, 13
$output = str_replace(chr(0xc2).chr(0x97), '&mdash;', $output);
$output = str_replace(chr(0xc2).chr(0xa0), '&nbsp;', $output);
// let's substitute HTML entities for some UTF-8 chars


echo $output;
// Finally! Outputting HTML to browser

// caching (save processed version and display it next time)
@file_put_contents($cached_file, $output);        
?>
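
The 'htmlizing' step is easier to follow on a small input. The sketch below wraps two of the engine's regular expressions in a function; the name ae_htmlize and the sample markup are assumptions made for this illustration. Unlike the engine code, it removes the XML declaration only when the output actually starts with one, which matters if the stylesheet is ever changed to omit the declaration.

```php
<?php
// Hypothetical helper: part of the engine's clean-up, as a function.
function ae_htmlize($output)
{
    // remove the XML declaration only when one is present
    if (strncmp($output, '<?xml', 5) == 0)
        $output = ltrim(substr($output, strpos($output, '?'.'>') + 2));

    // expand self-closed tags that browsers expect to be closed explicitly
    $output = preg_replace('!<(div|iframe|script|textarea)([^>]*?)/>!s',
                           '<$1$2></$1>', $output);

    // put a space before the closing /> of meta tags
    $output = preg_replace('!<(meta)([^>]*?)/>!s', '<$1$2 />', $output);

    return $output;
}

// Sample input, similar to what transformToXML() may return
$sample = '<?xml version="1.0" encoding="utf-8"?'.'>' . "\n"
        . '<html><head><meta name="a" content="b"/></head>'
        . '<body><div id="menu"/><p>text</p></body></html>';

$clean = ae_htmlize($sample);
echo $clean, "\n";
// prints (on one line):
// <html><head><meta name="a" content="b" /></head><body><div id="menu"></div><p>text</p></body></html>
```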

As you can see, cached files are stored in the '.cache' subfolder of the web site. Make sure it exists and is writable by your PHP scripts.
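
If you prefer the check to happen in code, something like the following sketch could run before the cache is written. The function name ae_prepare_cache_dir is an assumption made for this illustration, and the demo uses the system temp folder instead of a real document root.

```php
<?php
// Hypothetical helper: make sure the cache folder exists and is writable.
function ae_prepare_cache_dir($cache_dir)
{
    if (!is_dir($cache_dir))
        @mkdir($cache_dir, 0755, true);   // create missing parent folders too

    return is_dir($cache_dir) && is_writable($cache_dir);
}

// Demo: a throwaway "document root" in the system temp folder
$demo_root = sys_get_temp_dir().'/ae-xslt-demo-'.getmypid();
$ready = ae_prepare_cache_dir($demo_root.'/.cache');
var_dump($ready);

// remove the demo folders again
@rmdir($demo_root.'/.cache');
@rmdir($demo_root);
```

In the engine, the same call would receive $docroot.'/.cache', and a false result could simply skip the file_put_contents() step.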

How to use it? Look at the XML file:

source code: xml
<?xml version="1.0"?>
<page>
    <title>Page 2</title>
    <subtitle>Famous pangrams:</subtitle>
 
    <paragraph>
    	The quick brown fox jumped over the lazy dog's typewriter.
    </paragraph>
 
    <paragraph>
    	Cozy lummox gives smart squid who asks for job pen
    </paragraph>
 
    <references>
		<item url="http://www.anyexample.com">AnyExample</item>
		<item url="http://en.wikipedia.org/wiki/Panagram">Wikipedia pangrams</item>
    </references>
</page>

The web site's main XSLT file 'ae-site.xslt' contains the following templates:

source code: XSLT
<?xml version="1.0"?>
 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- XML output mode -->
    <xsl:output method="xml" standalone="yes" indent="no" encoding="utf-8"/> 
 
    <!-- we do not need spaces in output file -->
    <xsl:strip-space elements="*"/>
 
    <!-- this template copies unknown XML tags to output file, 
	 allows use of XHTML -->    
    <xsl:template match="@*|node()">
	<xsl:copy>
	    <xsl:apply-templates select="@*|node()" />
	</xsl:copy>
    </xsl:template>
 
    <!-- Main page template -->
    <xsl:template match="page">
    	<html>
    		<head>
    			<title><xsl:value-of select="//title"/></title>
    		</head>
    		<body>
 
    			<div style="width: 50%; padding: 8px; background-color: #DDD;">
    				<b>Menu: </b> 
    				<a href="index.xml">Main page</a>, 
    				<a href="page1.xml">Page 1</a>, 
    				<a href="page2.xml">Page 2</a>
    			</div>
 
    			<div style="width: 50%; padding: 4px; background-color: #EEE;">
    				<h1><xsl:value-of select="//title"/></h1>
 
    				<!-- process all children of page (apply-templates takes 'select', not 'match') -->
    				<xsl:apply-templates/>
 
    			</div>
 
    		</body>
    	</html>
    </xsl:template>
 
 
    <!-- For references section -->
 
    <xsl:template match="references">
    	<ol>
			<xsl:apply-templates select="item"/>
    	</ol>
    </xsl:template>
 
    <xsl:template match="references/item">
    	<li>
    		<xsl:choose>
    			<xsl:when test="@url"> <!-- item tag has url attribute -->
    				<a> <!-- enclose text in <a href="" -->
    					<xsl:attribute name="href"><xsl:value-of select="@url"/></xsl:attribute>
    					<xsl:apply-templates/></a>
    			</xsl:when>
    			<xsl:otherwise> <!-- Otherwise, make text italic -->
    				<i>--<xsl:apply-templates/></i>
    			</xsl:otherwise>
    		</xsl:choose>
    	</li>
    </xsl:template>
 
    <!-- paragraph tag -->
    	<xsl:template match="paragraph">
    		<p>
				<xsl:apply-templates/>
    		</p>
    </xsl:template>
 
    <!-- subtitle tag -->
    <xsl:template match="subtitle">
    	<h2>
			<xsl:apply-templates/>
    	</h2>
    </xsl:template>
 
 
    <!-- Empty template: we use values from these tags in 
	 other templates  -->
    <xsl:template match="title" /> 
</xsl:stylesheet>

So, when a web site visitor requests page2.xml, the XSLT transformation replaces the <page> tag with <html><head>..., sets the page title and H1 header from the <title> tag, and transforms each <paragraph> tag to <p>..., converting page2.xml from pure XML to XHTML.
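
To experiment with templates without touching Apache, the same DOMDocument and XSLTProcessor calls can be run from the command line on in-memory strings. This is only a sketch: the function name ae_transform and the cut-down page and stylesheet below are assumptions made for this illustration, not the files from the example site.

```php
<?php
// Hypothetical helper: transform an XML string with an XSLT string.
// Returns the result as a string, or false on any failure.
function ae_transform($xml_source, $xslt_source)
{
    if (!class_exists('XSLTProcessor'))
        return false;               // PHP was built without the XSL extension

    $xml = new DOMDocument;
    if (!$xml->loadXML($xml_source))
        return false;

    $stylesheet = new DOMDocument;
    if (!$stylesheet->loadXML($xslt_source))
        return false;

    $xsl = new XSLTProcessor();
    if (!$xsl->importStyleSheet($stylesheet))
        return false;

    return $xsl->transformToXML($xml);
}

// A cut-down page and stylesheet in the spirit of the files above
$page = '<page><title>Page 2</title><paragraph>Hello</paragraph></page>';

$xslt = '<xsl:stylesheet version="1.0"'
      . ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
      . '<xsl:output method="xml" encoding="utf-8"/>'
      . '<xsl:template match="page">'
      .   '<html><head><title><xsl:value-of select="title"/></title></head>'
      .   '<body><h1><xsl:value-of select="title"/></h1>'
      .   '<xsl:apply-templates/></body></html>'
      . '</xsl:template>'
      . '<xsl:template match="paragraph"><p><xsl:apply-templates/></p></xsl:template>'
      . '<xsl:template match="title"/>'
      . '</xsl:stylesheet>';

$html = ae_transform($page, $xslt);

if ($html === false)
    echo "Transformation failed (is the XSL extension installed?)\n";
else
    echo $html;
```

When the XSL extension is available, the output should contain <h1>Page 2</h1> followed by <p>Hello</p> inside the generated <body>.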

Download the whole XSLT engine example site in one zip archive.

warning 
  • Your site's source XML / XSLT pages should be valid XML documents
  • Your version of PHP 5 should be compiled with XSLT extension
tested by AnyExample.com on 2007-04-17
  • FreeBSD 6.2 :: Apache 2.2.4 :: PHP 5.2.1
  • FedoraCore 3 :: Apache 1.3 :: PHP 5.0.5
 


 
© AnyExample 2007