XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Author: Michael Kay
To take an example of how this is a problem in practice, suppose that the stylesheet defined a variable, $dummy, whose content included an instruction that writes a result document.
Now suppose that this variable is never referenced, or is referenced only as $dummy[1]. Is the result document produced, or not? Normally, an XSLT optimizer will avoid evaluating variables or parts of variables that aren't used, but this strategy causes problems if the evaluation of a variable has a side effect.
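The variable declaration itself is not shown here. As a rough sketch of the kind of declaration being described, assuming a hypothetical output file name out.xml and a placeholder element a (only the variable name dummy comes from the text), it might look like this:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:template match="/">
    <!-- Hypothetical: evaluating this variable would have the side
         effect of writing out.xml (an assumed file name). -->
    <xsl:variable name="dummy">
      <xsl:result-document href="out.xml">
        <a/>
      </xsl:result-document>
    </xsl:variable>

    <!-- If $dummy were never referenced, or only partly used as here,
         a lazy processor might never evaluate the instruction above. -->
    <result>
      <xsl:copy-of select="$dummy[1]"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

This is an illustration of the problem, not something to run: a conforming XSLT 2.0 processor is expected to report an error for an xsl:result-document instruction evaluated inside a variable, rather than write the file.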
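The way that the XSLT specification has dealt with this problem is essentially to say that you can only use <xsl:result-document> while the processor is writing a final result tree, not while it is evaluating a variable or a function.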
XSLT processors are allowed to evaluate instructions in any order. This means that you can't reliably predict the order in which final result trees get written. There is a rule preventing a stylesheet from writing two different result trees with the same URI, because if overwriting were allowed, the results would be nondeterministic. There is also a rule saying that it's an error to attempt to write a result tree and then read it back again using the document() function: this would be a sneaky way of exploiting side effects and making your stylesheet dependent on the order of execution. In practice, processors may have difficulty detecting this error, and you might get away with it, especially if you use different spellings of the URI, for example by writing to file:///c:/temp.xml and then reading from FILE:///c:/temp.xml.
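The two prohibited patterns can be sketched as follows. This is a deliberately incorrect stylesheet, shown only to make the rules concrete; the file names out.xml and temp.xml and all element names are assumptions made for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:template match="/">
    <index/>

    <!-- Pattern 1: two result trees with the same URI. Because the
         order of evaluation is not defined, allowing this would make
         the final content of out.xml unpredictable. -->
    <xsl:result-document href="out.xml">
      <first/>
    </xsl:result-document>
    <xsl:result-document href="out.xml">
      <second/>
    </xsl:result-document>

    <!-- Pattern 2: writing a result tree and reading it back.
         This assumes both relative URIs identify the same file; note
         that href is resolved against the base output URI, while
         document() resolves against the stylesheet's base URI. -->
    <xsl:result-document href="temp.xml">
      <data/>
    </xsl:result-document>
    <xsl:copy-of select="document('temp.xml')"/>
  </xsl:template>

</xsl:stylesheet>
```

Whether and when a given processor detects the second pattern is likely to vary, which is exactly why a stylesheet should not depend on it.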
The fact that order of execution is unpredictable has another consequence: if a transformation doesn't run to completion, because a runtime error occurred (or perhaps because an <xsl:message> instruction specified terminate="yes"), then it's unpredictable whether a particular final result tree was output before the termination. In practice, most processors only exploit the freedom to change the order of execution when evaluating variables or functions, so you are unlikely to run into this problem.
Usage
There are two main reasons for using <xsl:result-document>.
Generating multiple output files is very common in publishing applications. The product documentation for Saxon, for example, consists of around 450 HTML files, which are generated by a single transformation from 20 input XML files. Sometimes it's better to do the transformation in two stages: first, split a large XML document into several small XML documents, and then convert each of these into HTML independently.
One common approach is to generate one principal output file and a whole family of secondary output files. The principal output file can then serve as an index. Usually a key part of the process will be the generation of hyperlinks that allow the user to navigate within the document family. This means you will need some mechanism for generating the filenames of the output files. Exactly how you do this depends on what's available in your input: one approach is to use the generate-id() function, which allocates a unique identifier to every node in your input documents.
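A minimal sketch of the generate-id() approach, assuming a hypothetical input document whose book element contains chapter elements with title and para children (none of these names come from the text):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Principal output: an index page linking to every chapter file. -->
  <xsl:template match="/book">
    <html>
      <body>
        <ul>
          <xsl:apply-templates select="chapter"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="chapter">
    <!-- The same expression yields the same value for the same node
         within one run, so the link and the file name always agree. -->
    <xsl:variable name="file" select="concat(generate-id(), '.html')"/>

    <!-- Index entry, written to the principal output. -->
    <li><a href="{$file}"><xsl:value-of select="title"/></a></li>

    <!-- Secondary output: one HTML file per chapter. -->
    <xsl:result-document href="{$file}">
      <html>
        <body>
          <h1><xsl:value-of select="title"/></h1>
          <xsl:for-each select="para">
            <p><xsl:value-of select="."/></p>
          </xsl:for-each>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

One design point to keep in mind: the identifiers are only guaranteed to be consistent within a single transformation, so file names produced this way may change from one run to the next. If stable names matter, something derived from the input itself (a chapter number or an existing id attribute, say) is probably a better choice.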
Examples
The
Example: Creating Multiple Output Files
This example takes a poem as input, and outputs each stanza to a separate file. A more realistic example would be to split a book into its chapters, but I wanted to keep the files small.
Source
The source file is poem.xml. It starts:
…
Stylesheet
The stylesheet is split.xsl.
We want to start a new output document for each stanza, so we use the <xsl:result-document> instruction:
version="2.0">
select="concat('verse', position(), '.xml')"/>
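Only fragments of the listing appear above. A minimal sketch of how a stylesheet of this kind might be put together is shown below. It assumes that the input has a poem element containing stanza elements, and it uses a hypothetical verse element for the index entries; only the version attribute and the concat() expression are taken from the fragments.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Principal output: a skeleton of the poem that acts as an index. -->
  <xsl:template match="poem">
    <poem>
      <xsl:apply-templates select="stanza"/>
    </poem>
  </xsl:template>

  <xsl:template match="stanza">
    <!-- position() counts stanzas only, because the apply-templates
         call above selects only stanza elements. -->
    <xsl:variable name="file"
                  select="concat('verse', position(), '.xml')"/>

    <!-- Hypothetical index entry pointing at the secondary file. -->
    <verse number="{position()}" href="{$file}"/>

    <!-- Everything inside this instruction goes to the new file. -->
    <xsl:result-document href="{$file}">
      <xsl:copy-of select="."/>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

With three stanzas in the input, this sketch would be expected to produce verse1.xml, verse2.xml, and verse3.xml alongside the principal output, each containing a copy of one stanza.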
To run this example under Saxon, you need to make sure that an output file is supplied for the principal output document. This determines the base output URI, and the other output documents will be written to locations that are relative to this base URI. For example:
java -jar c:\saxon\saxon9.jar -t -o:c:\temp\index.xml
-s:poem.xml -xsl:split.xsl
This will write the index document to c:\temp\index.xml, and the verses to files such as c:\temp\verse2.xml. The -t option is useful because it tells you exactly where the files have been written.
Output
The principal output file contains the skeletal poem below (indented for legibility):
Three further output files, verse1.xml, verse2.xml, and verse3.xml, are created in the same directory as the principal output file. Here is verse1.xml: