XML technologies | XPath | XPath introduction

Addressing

Addressing means formulating expressions that allow the processor to access and select individual nodes or node sets. Such expressions are called "location paths" and can take different forms:

with a short or a detailed notation,
as relative or absolute location paths.

Short or detailed notation

There is a short and a detailed notation for the nodes and axes mentioned so far. In practice the two notations are mixed: the short notation is usually used for nodes or axes that are accessed very often, whereas the detailed notation is used for those that are needed more rarely. The following sections use the detailed notation and, where it is common, the short notation as well.

One example in detailed notation:

/child::book/child::author/child::name/attribute::lastName

in short notation:

/book/author/name/@lastName

When using the short notation, the axis name child:: is omitted and the attribute node is designated with the @ character. Later you can read more about this.

Detailed notation	Short notation
child::	"Default" axis, can be left out
attribute::	@
/descendant-or-self::node()/	//
self::node()	.
parent::node()	..

Table: detailed and short notations of nodes and axes
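
The pairs in the table can be compared side by side. The following stylesheet is only a minimal sketch: it assumes a hypothetical input document shaped like the example path above (a <book> root element containing <author>, <name> and a lastName attribute), and the attribute value used in the comment is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed input (not part of the original text):
     <book><author><name lastName="Miller"/></author></book> -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:template match="/">
      <!-- detailed notation: every axis is written out -->
      <xsl:value-of select="/child::book/child::author/child::name/attribute::lastName"/>
      <xsl:text>&#10;</xsl:text>
      <!-- short notation: selects the same attribute node -->
      <xsl:value-of select="/book/author/name/@lastName"/>
      <xsl:text>&#10;</xsl:text>
      <!-- // stands for /descendant-or-self::node()/ -->
      <xsl:value-of select="count(//name) = count(/descendant-or-self::node()/child::name)"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:for-each select="/book/author/name">
         <!-- .. stands for parent::node() -->
         <xsl:value-of select="name(..) = name(parent::node())"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>
```

With the assumed input, the first two lines of the output show the same attribute value, and the two comparisons yield true, because each short form selects exactly the nodes of its detailed counterpart.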

Relative and absolute paths

A location path can be written as a relative or as an absolute path. An absolute path is often described as starting from the root element; strictly speaking, it starts from the root node, which is the node directly above the root element. Without this distinction, that is, if the root element were the starting point, it would not be possible to address, for example, processing instructions or comments located outside the root element.
Location paths consist of one or more steps, which are separated from each other by slashes.

Example:

/child::EUROPE/child::COUNTRY/child::NAME

This example shows an absolute path which starts from the root node. The root node is indicated by the first slash. From there, the <EUROPE> root element is selected. It must have a <COUNTRY> child element, which in turn must have a <NAME> child element in order to get a match. Each axis is indicated by its name followed by two colons. In the case of child::, the axis can be omitted, so the path /EUROPE/COUNTRY/NAME is equivalent (short notation). Relative paths, in contrast, require a context node, and the path is evaluated relative to this node:

Example:

child::COUNTRY/child::NAME

In this example a match only occurs if the context node has a <COUNTRY> child element which in turn contains a <NAME> child element. This description of relative paths is not complete; please refer to the specialist literature on XPath for details.
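
The effect of the context node can be made visible by evaluating the same relative path from two different starting points. The stylesheet below is a sketch under the assumption that the input has an <EUROPE> root element with <COUNTRY> children, each of which contains a <NAME>; xsl:for-each is used here only as a simple way to change the context node.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:template match="/">
      <xsl:for-each select="/child::EUROPE">
         <!-- Context node: <EUROPE>. The relative path finds the names. -->
         <xsl:text>relative, context EUROPE: </xsl:text>
         <xsl:value-of select="count(child::COUNTRY/child::NAME)"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
      <xsl:for-each select="/child::EUROPE/child::COUNTRY[1]">
         <!-- Context node: the first <COUNTRY>. It has no <COUNTRY> child,
              so the same relative path selects nothing here. -->
         <xsl:text>relative, context COUNTRY: </xsl:text>
         <xsl:value-of select="count(child::COUNTRY/child::NAME)"/>
         <xsl:text>&#10;</xsl:text>
         <!-- The absolute path starts at the root node and ignores the context. -->
         <xsl:text>absolute, context COUNTRY: </xsl:text>
         <xsl:value-of select="count(/child::EUROPE/child::COUNTRY/child::NAME)"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>
```

Under these assumptions the second count is 0, while the first and the third count are equal: only the relative path depends on where the evaluation starts.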

XPath examples in the context of an XSLT stylesheet

Our exercise instance (Europe.xml):

<?xml version="1.0" encoding="UTF-8"?>
<EUROPE>
    <COUNTRY>
        <NAME>Germany</NAME>
        <POPULATION UNIT="Millions">82,4</POPULATION>
        <CAPITAL>Berlin</CAPITAL>
        <COUNTRYSYMBOL>D</COUNTRYSYMBOL>
        <CALLINGCODE>0049</CALLINGCODE>
    </COUNTRY>
    <COUNTRY>
        <NAME>France</NAME>
        <POPULATION UNIT="Millions">58,5</POPULATION>
        <CAPITAL>Paris</CAPITAL>
        <COUNTRYSYMBOL>F</COUNTRYSYMBOL>
        <CALLINGCODE>0033</CALLINGCODE>
    </COUNTRY>
    <COUNTRY>
        <NAME>Spain</NAME>
        <POPULATION UNIT="Millions">39,4</POPULATION>
        <CAPITAL>Madrid</CAPITAL>
        <COUNTRYSYMBOL>E</COUNTRYSYMBOL>
        <CALLINGCODE>0034</CALLINGCODE>
    </COUNTRY>
</EUROPE>

This is an XSLT example with XPath location paths:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:template match="/">                                                  <!-- (1) -->
      <html>
         <title></title>
         <body>
            <h3>
               <xsl:apply-templates/>
            </h3>
         </body>
      </html>
   </xsl:template>
   <xsl:template match="EUROPE">                                             <!-- (2) -->
        <xsl:if test="COUNTRY/NAME">                                         <!-- (2) -->
            MATCH : /EUROPE/COUNTRY/NAME
            <br/>
        </xsl:if>
        <xsl:if test="COUNTRY/POPULATION">                                   <!-- (3) -->
            MATCH : /EUROPE/COUNTRY/POPULATION
            <br/>
        </xsl:if>
        <xsl:if test="COUNTRY/CAPITAL">                                      <!-- (4) -->
            MATCH : /EUROPE/COUNTRY/CAPITAL
            <br/>
        </xsl:if>
        <xsl:if test="child::POPULATION">                                    <!-- (5) -->
            MATCH : /EUROPE/POPULATION
            <br/>
        </xsl:if>
        <xsl:if test="descendant::POPULATION">                               <!-- (6) -->
            MATCH : /EUROPE/descendant::POPULATION
            <br/>
        </xsl:if>
        <xsl:if test="COUNTRY/NAME/following-sibling::POPULATION">           <!-- (7) -->
            MATCH : /EUROPE/COUNTRY/NAME/following-sibling::POPULATION
            <br/><br/><br/><br/>
        </xsl:if>
        <xsl:if test="COUNTRY/NAME/following-sibling::POPULATION/
            parent::COUNTRY/descendant::CAPITAL">                            <!-- (8) -->
            MATCH : /EUROPE/COUNTRY/NAME/following-sibling::POPULATION/
            parent::COUNTRY/descendant::CAPITAL
            <br/><br/><br/><br/>
        </xsl:if>
        <xsl:apply-templates/>
    </xsl:template>
    <xsl:template match="COUNTRY">                                           <!-- (9) -->
        <xsl:if test="CAPITAL"> MATCH : /EUROPE/COUNTRY/CAPITAL              <!-- (9) -->
            <br/>
        </xsl:if>
        <xsl:if test="parent::EUROPE"> MATCH : /EUROPE/COUNTRY/parent::EUROPE  <!-- (10) -->
            <br/>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

(1) The / character indicates the root node.

(2) The <xsl:template> element matches the <EUROPE> root element, which thereby becomes the context node for the <xsl:if> elements inside the template. All relative location paths in the template start from this node. The first one is COUNTRY/NAME. Since <EUROPE> has a <COUNTRY> child element which itself has a <NAME> child element, the instructions within the <xsl:if> element are executed, and the absolute path /EUROPE/COUNTRY/NAME appears in the output.

(3) COUNTRY/POPULATION is likewise evaluated relative to the <EUROPE> context node. A <COUNTRY> child element with a <POPULATION> child element exists, so the output occurs.

(4) The same applies to COUNTRY/CAPITAL: starting from the <EUROPE> context node, a <COUNTRY> child element with a <CAPITAL> child element is found.

(5) Here, no output occurs because the <EUROPE> context node has no <POPULATION> child element. Writing just POPULATION would have been sufficient, since by convention it stands for child::POPULATION.

(6) Here again an output occurs because there is a <POPULATION> descendant of <EUROPE>.

(7) Here the processor searches for a <POPULATION> element that is a following sibling (following-sibling::) of <NAME> and therefore also has <COUNTRY> as its parent element. Since a <POPULATION> element does follow <NAME>, the output occurs.

(8) This pattern demonstrates that location paths can be made as complex as required. The search starts with a <COUNTRY> child element of the <EUROPE> context node. This element must contain a <NAME> child element that has a following <POPULATION> sibling element. Up to this point the query corresponds to the previous one, and we know that it matches. The <POPULATION> element must then have a <COUNTRY> parent element which in turn has a <CAPITAL> descendant. As the output shows, all conditions are met.

(9) The new template changes the context node: the location paths within this template are relative to <COUNTRY>. The first test checks whether the <COUNTRY> element has a <CAPITAL> child element.

(10) Here the search is for an <EUROPE> parent element of the <COUNTRY> context node.

Now we get the following result:

<html>
    <title></title>
    <body>
        <h3> 
            MATCH : /EUROPE/COUNTRY/NAME                                   (2)
            <br/> 
            MATCH : /EUROPE/COUNTRY/POPULATION                             (3)
            <br/>
            MATCH : /EUROPE/COUNTRY/CAPITAL                                (4)
            <br/> 
            MATCH : /EUROPE/descendant::POPULATION                         (6)
            <br/> 
            MATCH : /EUROPE/COUNTRY/NAME/following-sibling::POPULATION     (7)
            <br/><br/><br/><br/>
            MATCH : /EUROPE/COUNTRY/NAME/following-sibling::POPULATION/    (8)
            parent::COUNTRY/descendant::CAPITAL 
            <br/><br/><br/><br/> 
            MATCH : /EUROPE/COUNTRY/CAPITAL                                (9)
            <br/> 
            MATCH : /EUROPE/COUNTRY/parent::EUROPE                         (10)
            <br/> 
            MATCH : /EUROPE/COUNTRY/CAPITAL                                (9)
            <br/> 
            MATCH : /EUROPE/COUNTRY/parent::EUROPE                         (10)
            <br/> 
            MATCH : /EUROPE/COUNTRY/CAPITAL                                (9)
            <br/> 
            MATCH : /EUROPE/COUNTRY/parent::EUROPE                         (10)
            <br/>  
        </h3>
    </body>
</html>
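
The tests above only check whether a location path selects anything. As a small variation, the following sketch writes out the string values of the selected nodes instead. It assumes the upper-case element and attribute names used in the stylesheet above (EUROPE, COUNTRY, NAME, POPULATION, UNIT, CAPITAL); the text output method is chosen only to keep the result short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:template match="/">
      <!-- process only the COUNTRY elements, skipping whitespace text nodes -->
      <xsl:apply-templates select="EUROPE/COUNTRY"/>
   </xsl:template>
   <!-- Inside this template the context node is one <COUNTRY> element. -->
   <xsl:template match="COUNTRY">
      <xsl:value-of select="NAME"/>
      <xsl:text>: </xsl:text>
      <!-- same step as in (7), but the value of the node is written out -->
      <xsl:value-of select="NAME/following-sibling::POPULATION"/>
      <xsl:text> </xsl:text>
      <!-- attribute of the POPULATION child, short notation -->
      <xsl:value-of select="POPULATION/@UNIT"/>
      <xsl:text>, capital </xsl:text>
      <xsl:value-of select="child::CAPITAL"/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>
</xsl:stylesheet>
```

For a country with the name Germany, a population of 82,4 with the unit Millions and the capital Berlin, this would produce the line "Germany: 82,4 Millions, capital Berlin".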

In certain typesetting designs (e.g. in this book) it is common practice to indent the first line of a paragraph by a fixed amount, except for paragraphs that immediately follow a headline; these start flush with the left edge of the type area. The first of the following templates is responsible for the layout of paragraphs in general, and the second template applies to all paragraphs which directly follow a headline. The paragraph element is called <para>, the headline element is called <title>.

<xsl:template match="para">
   <fo:block text-indent="6mm"><xsl:apply-templates/></fo:block>
</xsl:template>
<xsl:template match="para[preceding-sibling::*[1][self::title]]">
   <fo:block text-indent="0mm"><xsl:apply-templates/></fo:block>
</xsl:template>
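
How the predicate singles out the first paragraph can be tried without a formatter. The following sketch replaces the fo:block output with plain text labels; the <section> wrapper and the sample text in the comment are assumptions made only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input for this sketch:
     <section>
        <title>Heading</title>
        <para>First paragraph.</para>
        <para>Second paragraph.</para>
     </section> -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:strip-space elements="*"/>
   <!-- headlines are not output in this sketch -->
   <xsl:template match="title"/>
   <!-- general case: a pattern consisting of a name only has default priority 0 -->
   <xsl:template match="para">
      <xsl:text>indented:   </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>
   <!-- preceding-sibling::*[1] is the nearest preceding sibling element;
        [self::title] keeps it only if it is a <title> element.
        A pattern with a predicate has default priority 0.5, so this
        template takes precedence for paragraphs right after a headline. -->
   <xsl:template match="para[preceding-sibling::*[1][self::title]]">
      <xsl:text>flush left: </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>
</xsl:stylesheet>
```

With the assumed input, the first paragraph is labelled "flush left" and the second one "indented", because only the first has a <title> as its nearest preceding sibling element.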

Here is a small collection of paths for further practice:

child::EUROPE returns the <EUROPE> child elements of the context node.
child::node() returns the child nodes of the context node.
attribute::UNIT is the detailed notation for the common @UNIT and returns the UNIT attribute of the context node.
descendant::EUROPE returns all <EUROPE> elements which are descendants of the context node.
self::EUROPE returns the context node itself if it is an <EUROPE> element; otherwise the result is empty.
/ returns the root node.
//*/@* returns all attributes of all elements in the document.
/child::comment() returns all comment nodes which are children of the root node.
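
Several of these paths can be tried out together. The stylesheet below is a sketch that assumes an input whose root element is <EUROPE>, as in the examples of this section; whether comments exist outside the root element depends on the input, so the counts are only reported and not predicted here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:template match="/">
      <!-- / is the root node; its only element child is the root element -->
      <xsl:text>root element: </xsl:text>
      <xsl:value-of select="name(/*)"/>
      <xsl:text>&#10;</xsl:text>
      <!-- comments outside the root element are children of the root node -->
      <xsl:text>comments below the root node: </xsl:text>
      <xsl:value-of select="count(/child::comment())"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:text>attributes in the document: </xsl:text>
      <xsl:value-of select="count(//*/@*)"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:for-each select="/*">
         <!-- the context node is now the root element -->
         <xsl:text>self::EUROPE matches: </xsl:text>
         <xsl:value-of select="boolean(self::EUROPE)"/>
         <xsl:text>&#10;</xsl:text>
         <xsl:text>child nodes of the context node: </xsl:text>
         <xsl:value-of select="count(child::node())"/>
         <xsl:text>&#10;</xsl:text>
         <xsl:text>EUROPE descendants of the context node: </xsl:text>
         <xsl:value-of select="count(descendant::EUROPE)"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>
```

Note that child::node() also counts whitespace-only text nodes between the child elements, unless the stylesheet strips them.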

Addressing of different node types

So far we have concentrated on addressing element nodes. All other node types can be selected as well and used for evaluation, processing and output. The following table shows by example how these nodes are selected.

name/text()	Addresses the text nodes which are children of the name element, using the text() node test.
adresse/comment()	Addresses the comment nodes which are children of the adresse (address) element, using the comment() node test.
ort/processing-instruction()	Addresses the processing instruction (PI) nodes which are children of the ort (location) element, using the processing-instruction() node test.
adresse/node()	Addresses all child nodes of the adresse element regardless of their type, using the node() node test. Attributes are not included because they are not children.
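
The node tests can also be used on the self axis to find out what kind of node has been selected. The following sketch assumes that <adresse> is the document element of a hypothetical input; it walks over all child nodes returned by node() and reports the type of each one.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <!-- Sketch: assumes <adresse> is the document element of the input. -->
   <xsl:template match="/adresse">
      <!-- node() selects every child node, whatever its type -->
      <xsl:for-each select="node()">
         <xsl:choose>
            <xsl:when test="self::*">
               <xsl:text>element: </xsl:text>
               <xsl:value-of select="name()"/>
            </xsl:when>
            <xsl:when test="self::text()">
               <xsl:text>text: </xsl:text>
               <xsl:value-of select="normalize-space(.)"/>
            </xsl:when>
            <xsl:when test="self::comment()">
               <xsl:text>comment: </xsl:text>
               <xsl:value-of select="."/>
            </xsl:when>
            <xsl:when test="self::processing-instruction()">
               <xsl:text>processing instruction: </xsl:text>
               <xsl:value-of select="name()"/>
            </xsl:when>
         </xsl:choose>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
      <!-- attributes are not children; they need the attribute axis -->
      <xsl:text>attributes: </xsl:text>
      <xsl:value-of select="count(@*)"/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>
</xsl:stylesheet>
```

Attributes never appear in the loop, which is why they are counted separately via @*.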

Copyright © dpunkt.verlag GmbH 2007
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "Professionelle XML-Verarbeitung mit Word" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.

dpunkt.verlag GmbH, Ringstraße 19B, 69115 Heidelberg, fon 06221-14830, fax 06221-148399, hallo(at)dpunkt.de