XSLT and XPath function reference in alphabetical order

(Excerpt from “XSLT 2.0 & XPath 2.0” by Frank Bongers, chapter 5, translated from German)


regex-group

Category:

String functions – analysis and manipulation

Origin:

XSLT 2.0

Return value:

An xs:string value: the substring of an xsl:analyze-string match that corresponds to the subgroup identified by the integer argument.

Call/Arguments:

regex-group($subgroupIndex)

$subgroupIndex:

An xs:integer value, normally equal to or greater than 1; it identifies a subgroup, marked by parentheses, in the regular expression of an xsl:analyze-string instruction.

Note: The value of $subgroupIndex may also be 0. In that case the function returns not a subgroup but the entire match of the regular expression. Negative values of $subgroupIndex do not raise an error; they return the empty string because no corresponding subgroup exists.
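The special index values can be tried out with a small stylesheet. The following is a minimal sketch, not part of the original reference: the named template "main", the input string 'id=42' and the regular expression are assumptions chosen for illustration, and the stylesheet has to be started at the named template (Saxon, for example, offers the -it option for this).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>
  <!-- Hypothetical entry template, so that no source document is required -->
  <xsl:template name="main">
    <xsl:analyze-string select="'id=42'" regex="([a-z]+)=(\d+)">
      <xsl:matching-substring>
        <!-- Index 0: the entire match, not a subgroup -->
        <xsl:value-of select="regex-group(0)"/>
        <xsl:text>|</xsl:text>
        <!-- Negative index: no such subgroup, so the string has length 0 -->
        <xsl:value-of select="string-length(regex-group(-1))"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Under these assumptions the output should be id=42|0: the complete match, followed by the length of the empty string returned for the negative index.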

Purpose of use:

While an xsl:analyze-string instruction is being processed, the XSLT function regex-group() gives access to the subgroups (the current captured substrings) of each match of the instruction's regular expression. The subgroups are defined in the expression by parenthesized subexpressions.

Within the regular expression, the subgroups are numbered from left to right, beginning with 1, in the order of their opening parentheses. They behave like temporary variables in which substrings of the match are stored at runtime.

For a regular expression with two subgroups, such as "(\d\d)\.(\d\d)", which matches two pairs of digits separated by a dot, the function returns the first pair of digits for argument 1 and the second pair of digits for argument 2.
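The two-subgroup expression can be sketched in a complete stylesheet as follows. The named template "main", the input string and the element names numbers and pair are assumptions made for this illustration; only the regular expression is taken from the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <!-- Hypothetical entry template, so that no source document is required -->
  <xsl:template name="main">
    <numbers>
      <xsl:analyze-string select="'10.25 and 07.30'" regex="(\d\d)\.(\d\d)">
        <xsl:matching-substring>
          <!-- Subgroup 1: digits before the dot; subgroup 2: digits after the dot -->
          <pair first="{regex-group(1)}" second="{regex-group(2)}"/>
        </xsl:matching-substring>
        <!-- No xsl:non-matching-substring: the text " and " is discarded -->
      </xsl:analyze-string>
    </numbers>
  </xsl:template>
</xsl:stylesheet>
```

Each of the two matches produces one pair element: the first carries the values 10 and 25, the second 07 and 30.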

The function is only useful within the xsl:matching-substring child instruction, that is, while a match of the pattern is being processed. Called anywhere else, it returns the zero-length string, in keeping with its declared return type xs:string.

If a number is passed for which no corresponding subgroup exists, the function returns the empty string. The same happens, however, if the indicated subgroup exists and matched the empty string, or if it exists but did not take part in the match.
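The cases that lead to an empty string cannot be told apart by the return value alone. A minimal sketch, assuming a hypothetical named template "main" and an invented regular expression with one mandatory, one optional and one possibly empty subgroup:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>
  <!-- Hypothetical entry template, so that no source document is required -->
  <xsl:template name="main">
    <xsl:analyze-string select="'a'" regex="(a)(b)?(c*)">
      <xsl:matching-substring>
        <!-- Subgroup 1 matches "a"; subgroup 2 takes no part in the match;
             subgroup 3 matches the empty string; subgroup 4 does not exist -->
        <xsl:value-of
            select="for $i in 1 to 4 return string-length(regex-group($i))"
            separator=","/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Under these assumptions the output should be 1,0,0,0. A stylesheet that needs to distinguish the three empty cases has to do so through the design of the regular expression, not through regex-group().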

Example:

(Translated from the German data2type XSLT 2.0 reference, xsl:analyze-string element.)

Source document:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <section>
        <para>
           The [b]Extensible Markup Language[/b], abbreviated [i]XML[/i], is a markup language
           for representing hierarchically structured data in the form of text files. 
        </para>
    </section>
</root>

The source document contains a kind of markup with square brackets [], as is often used in Internet forums. The stylesheet is to convert these pseudo tags into real XML tags.

XSLT stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xhtml"/>
  <xsl:template match="section">
    <section>
      <para>
        <xsl:analyze-string select="para" regex="\[(.*?)\](.*?)\[/(.*?)\]" flags="s">
          <xsl:matching-substring>
            <xsl:element name="{regex-group(1)}">
              <xsl:value-of select="regex-group(2)"/>
            </xsl:element>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
              <xsl:value-of select="."/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </para>
    </section>
  </xsl:template>
</xsl:stylesheet>

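With the help of xsl:analyze-string, the text is searched for the pseudo tags. In the regular expression \[(.*?)\](.*?)\[/(.*?)\], the first subgroup (between \[ and \]) captures the "element name", and the second subgroup (between the opening and the closing pseudo tag) captures the "element content". The third subgroup captures the name in the closing pseudo tag and is not used by the stylesheet.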

The regex-group() function is used in order to correctly generate the element and to insert the element content into the generated element.
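Because the closing pseudo tag is matched by a separate subgroup, the expression above also accepts mismatched pairs such as [b]...[/i]. One possible variant, shown here only as a sketch and not as the author's stylesheet, uses the back-reference \1 (permitted in XPath 2.0 regular expressions) so that the closing name must repeat the opening one. The restriction of tag names to lowercase letters ([a-z]+) and the use of a template matching para are assumptions of this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml"/>
  <xsl:template match="para">
    <para>
      <!-- \1 is a back-reference to subgroup 1: the closing pseudo tag
           must carry the same name as the opening pseudo tag -->
      <xsl:analyze-string select="." regex="\[([a-z]+)\](.*?)\[/\1\]" flags="s">
        <xsl:matching-substring>
          <!-- Subgroup 1: element name; subgroup 2: element content -->
          <xsl:element name="{regex-group(1)}">
            <xsl:value-of select="regex-group(2)"/>
          </xsl:element>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </para>
  </xsl:template>
</xsl:stylesheet>
```

Restricting the name to [a-z]+ also prevents an empty or otherwise invalid element name from reaching xsl:element, which the pattern (.*?) does not rule out.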

Result:

<?xml version="1.0" encoding="UTF-8"?>
<section>
   <para>
     The <b>Extensible Markup Language</b>, abbreviated <i>XML</i>, is a markup language
     for representing hierarchically structured data in the form of text files. 
   </para>
</section>

Function definition:

XSLT 1.0:

The function is not available.

XSLT 2.0:

regex-group($group-number as xs:integer) as xs:string

   


Copyright © Galileo Press, Bonn 2008
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "XSLT 2.0 & XPath 2.0" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.


Galileo Press, Rheinwerkallee 4, 53227 Bonn, Germany