The XProc concept
XProc is an XML-based language that provides flow-control commands for building XML-oriented workflows. An XProc document is read by a processor, which executes these commands. The individual operations, called steps, are performed in sequence within a pipeline.
Figure: simple example of a pipeline
XProc thus offers a set of built-in facilities for manipulating and processing XML, and it can also integrate external technologies (e.g. an XSLT transformation or a Schematron validation) as individual processing steps.
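As a minimal sketch of such an integration, the following XProc 1.0 pipeline runs an XSLT transformation and then validates the result with Schematron. The file names transform.xsl and rules.sch are assumptions made for illustration only, and <p:validate-with-schematron> is an optional step that a given processor may not support.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- The document to process is supplied by the processor at run time. -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Step 1: transform the source document (transform.xsl is a hypothetical file). -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="transform.xsl"/>
    </p:input>
    <!-- No stylesheet parameters are passed in this sketch. -->
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- Step 2: validate the transformed document (rules.sch is a hypothetical file). -->
  <p:validate-with-schematron>
    <p:input port="schema">
      <p:document href="rules.sch"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:validate-with-schematron>
</p:declare-step>
```

Each step reads from the primary output of the step before it, so no explicit connection between the two steps has to be written here.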
Structure of an XProc document
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="1.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello World!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <p:identity/>
</p:declare-step>
The code excerpt in the figure shows the typical structure of an XProc document. The elements used in this example will be explained in later chapters.
The root element of an XProc document can be <p:declare-step> or <p:pipeline>. The latter is a simplified variant that requires less information, because it declares the standard ports implicitly. The exact differences between the two root elements are also discussed later. For now, note that <p:declare-step> is to be preferred because it gives the user more control.
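As a rough sketch of the difference: in XProc 1.0 a <p:pipeline> implicitly declares a "source" input port, a "parameters" input port and a "result" output port, so an identity pipeline can be reduced to the form below. Because the input port is not written out, no inline document can be attached to it. This sketch therefore assumes that the processor supplies the input document at run time.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Shorthand form: the "source", "parameters" and "result" ports are declared implicitly. -->
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Copies whatever arrives on the implicit "source" port to "result". -->
  <p:identity/>
</p:pipeline>
```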
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
When creating the <p:declare-step> root element, the appropriate XProc namespaces are declared and the version number of XProc is indicated.
<p:input port="source">
  <p:inline>
    <doc>Hello World!</doc>
  </p:inline>
</p:input>
With <p:input> the input port is defined. The “port” attribute is required: its value names the input port, and that name can be referenced in later processing stages. By convention, “source” is used in this example. With <p:inline>, a document can be written directly into the XProc document. In this example, a <doc> element contains the customary “Hello World!” greeting.
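A document does not have to be written inline. As a hedged sketch, the same port could instead read an external file with <p:document>. The file name input.xml is an assumption for illustration.

```xml
<p:input port="source">
  <!-- Hypothetical file; a relative href is resolved against the pipeline document's location. -->
  <p:document href="input.xml"/>
</p:input>
```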
<p:output port="result"/>
<p:identity/>
With <p:output> the output port is declared; it defines the end of the pipeline. As with the input, a name is assigned via the “port” attribute; here the conventional value “result” is used. <p:identity> copies the content from the input port to the output unchanged. Running this XProc pipeline produces the following:
<doc xmlns:c="http://www.w3.org/ns/xproc-step">Hello World!</doc>
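The connections in this example are implicit: <p:identity> reads from the pipeline's "source" port, and its output is passed on to "result". To show where the port names come into play, the following sketch writes the input connection out explicitly with <p:pipe>. The step name "main" is an assumption chosen for this illustration. The pipeline should behave the same as the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="main" version="1.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello World!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <p:identity>
    <!-- Explicitly read from the "source" port of the step named "main". -->
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:identity>
</p:declare-step>
```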