Inline elements

Text formatting

Inline elements, formatting and the text itself are contained in the run element <w:r>. The <w:r> element has a whole range of child elements. In DTD syntax, its content model is as follows:

(rPr?, (aml:annotation | br | t | delText | instrText | delInstrText | noBreakHyphen | 
  softHyphen | annotationRef | footnoteRef | endnoteRef | separator | 
  continuationSeparator | footnote | endnote | sym | pgNum | cr | tab | pict | 
  fldChar | ruby | wx:t+))

The <w:rPr> element plays the same role for runs that the <w:pPr> element plays for paragraphs. As a container element, it bundles all formatting properties of the run. Through it, a character style can be assigned and local formatting can be specified.

The following example uses a character style for one run and local formatting for another; both runs look the same when rendered:

...
   <w:style w:type="character" w:styleId="Hervorhebung">   (2)   
     <w:name w:val="Emphasis"/>       
     <w:basedOn w:val="Absatz-Standardschriftart"/>       
     <w:rsid w:val="00810D0D"/>       
     <w:rPr>         
       <w:i/> <w:i-cs/>
     </w:rPr>     
   </w:style>   
</w:styles>
 ...
<w:p>
   <w:r>
     <w:t>Dies ist ein </w:t> <!-- en: This is a -->  (4)
   </w:r>
   <w:r>
     <w:rPr>                                 (1)
       <w:rStyle w:val="Hervorhebung"/>      (2)
     </w:rPr>
     <w:t>Absatz</w:t> <!-- paragraph -->
   </w:r>
   <w:r>
     <w:t> mit </w:t> <!-- with -->
   </w:r>
   <w:r>
     <w:rPr>                                 (1)
       <w:i/>                                (3)
     </w:rPr>
     <w:t>inzeiligen</w:t> <!-- inline -->
   </w:r>
   <w:r>
     <w:t> Auszeichnungen.</w:t> <!-- markups. -->    (4)
   </w:r>
</w:p>

(1) The <w:rPr> element defines the inline properties of the text that follows it in the same run.

(2) The <w:rStyle> element assigns a character style to the run. Its w:val attribute contains the ID of the style, that is, the value of the w:styleId attribute in the style definition.

(3) The <w:i> element italicizes the text. Word writes this element when the K button (I in English versions of Word) is clicked in the formatting toolbar.

(4) Text without inline or paragraph formatting is formatted with the Standard style (Normal in English versions of Word).

Figure: inline formatting
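
Because <w:rPr> is a container, a character style and local formatting can also be combined within a single run. The following fragment is a minimal sketch, assuming that the Hervorhebung style from the listing above is defined and that the w prefix is bound to the WordML namespace as in that listing; the choice of <w:b/> as the local property and the sample text are purely illustrative:

```xml
<w:p>
  <w:r>
    <w:rPr>
      <!-- character style: Hervorhebung supplies the italics -->
      <w:rStyle w:val="Hervorhebung"/>
      <!-- local formatting in addition to the style: bold -->
      <w:b/>
    </w:rPr>
    <w:t>fett und kursiv</w:t> <!-- en: bold and italic -->
  </w:r>
</w:p>
```

The run should appear both bold and italic: the italics come from the style, the bold from the local property.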

The formatting properties of a run essentially correspond to the options in the character window. You can open this dialog via Format -> Zeichen (Font in English versions of Word).

Figure: the character window

The <w:rPr> element has a range of child elements. In DTD syntax, its content model is as follows:

(rStyle | rFonts | wx:font | wx:sym | b | b-cs | i | i-cs | caps | smallCaps | 
  strike | dstrike | outline | shadow | emboss | imprint | noProof | 
  snapToGrid | vanish | webHidden | color | spacing | w | kern | position | 
  sz | sz-cs | highlight | u | effect | bdr | shd | fitText | vertAlign | 
  rtl | cs | em | hyphen | lang | asianLayout | specVanish | aml:annotation)+

The following example uses three of these elements:

<w:p>
   <w:r>
     <w:t>In dieser Zeile kommen </w:t> <!-- en: This line is -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:b/>                          (1)
     </w:rPr>
     <w:t>fett</w:t> <!-- bold -->
   </w:r>
   <w:r>
     <w:t>, </w:t>
   </w:r>
   <w:r>
     <w:rPr>
       <w:i/>                          (2)
     </w:rPr>
     <w:t>kursiv</w:t> <!-- in italics -->
   </w:r>
   <w:r>
     <w:t> und </w:t> <!-- and -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:u w:val="single"/>           (3)
     </w:rPr>
     <w:t>unterstrichen</w:t> <!-- underlined -->
   </w:r>
   <w:r>
     <w:t> vor. Alles </w:t> <!-- All at the -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:b/>
       <w:i/>
       <w:u w:val="single"/>
     </w:rPr>
     <w:t>gleichzeitig.</w:t> <!-- same time. -->
   </w:r>
</w:p>

(1) The <w:b> element formats the text in bold. Word writes this element when the F button (B in English versions of Word) is clicked in the formatting toolbar or the corresponding option has been selected in the character window. It may carry a w:val attribute with the value on or off, which explicitly switches bold on or off.

(2) The <w:i> element italicizes the text. Word writes this element when the K button (I in English versions of Word) is clicked in the formatting toolbar or the corresponding option has been selected in the character window. Like <w:b>, it may carry a w:val attribute with the value on or off, which explicitly switches italics on or off.

(3) The <w:u> element underlines the text. Word writes this element when the U button is clicked in the formatting toolbar or the corresponding option has been selected in the character window. Its w:val attribute selects the underline style: single, words, double, thick, dotted, dotted-heavy, dash, dashed-heavy, dash-long, dash-long-heavy, dot-dash, dash-dot-heavy, dot-dot-dash, dash-dot-dot-heavy, wave, wavy-heavy, wavy-double or none. If the attribute is omitted, the text is not underlined. The optional w:color attribute sets the colour of the underline as an RGB hexadecimal value.

Figure: inline properties
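
The attributes described in (1) to (3) can be sketched as follows. The fragment assumes the Hervorhebung character style defined earlier in this chapter (which sets italics); the sample text and the colour value are made up for illustration:

```xml
<w:p>
  <w:r>
    <w:rPr>
      <w:rStyle w:val="Hervorhebung"/>
      <!-- explicitly switch off the italics that the style would apply -->
      <w:i w:val="off"/>
    </w:rPr>
    <w:t>nicht kursiv, </w:t> <!-- en: not italic, -->
  </w:r>
  <w:r>
    <w:rPr>
      <!-- double underline; the colour attribute is given as RRGGBB (red) -->
      <w:u w:val="double" w:color="FF0000"/>
    </w:rPr>
    <w:t>doppelt unterstrichen</w:t> <!-- en: double underlined -->
  </w:r>
</w:p>
```

The first run shows the typical use of w:val="off": cancelling a property that would otherwise be inherited, rather than simply leaving the element out.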

<w:p>
   <w:r>
     <w:t>Eine andere Schriftart wie z.B. </w:t> <!-- en: A different font such as -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:rFonts w:ascii="Courier New" w:h-ansi="Courier New" w:cs="Courier New"/> (1)
       <wx:font wx:val="Courier New"/>             (1)
     </w:rPr>
     <w:t>Courier New</w:t>
   </w:r>
   <w:r>
     <w:t>, eine andere </w:t> <!-- , a different -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:sz w:val="28"/>                          (2)
     </w:rPr>
     <w:t>Schriftgröße, </w:t> <!-- font size, -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:vertAlign w:val="superscript"/>          (3)
     </w:rPr>
     <w:t>hoch gestellt und </w:t> <!-- superscripted and -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:vertAlign w:val="subscript"/>            (3)
     </w:rPr>
     <w:t>tief gestellt </w:t> <!-- subscripted -->
   </w:r>
   <w:r>
     <w:t>und in einer anderen </w:t> <!-- and in a different -->
   </w:r>
   <w:r>
     <w:rPr>
       <w:color w:val="FF0000"/>                   (4)
     </w:rPr>
     <w:t>Farbe.</w:t> <!-- colour. -->
   </w:r>
</w:p>

(1) The <w:rFonts> element specifies the font of a run. Word 2003 sets the three attributes used in this example automatically; each names the font to be used for a different range of characters. For this example only the w:ascii attribute is evaluated. If the text contained characters outside the ASCII range (such as umlauts), the w:h-ansi attribute would be evaluated as well. For clarity, all three attributes should be set when WordML documents are generated manually or by a stylesheet. The same applies to the <wx:font> element, which gives Word 2003 a hint about the font to be used.

(2) The <w:sz> element sets the font size. Its w:val attribute gives the size in half-points, so the value 28 used here corresponds to 14 pt.

(3) The <w:vertAlign> element superscripts or subscripts text. Its w:val attribute takes one of three values: baseline, superscript and subscript. The superscript value raises the text, the subscript value lowers it and the baseline value places it on the baseline. For superscript and subscript, the font size is reduced automatically at the same time.

(4) The <w:color> element gives the text a different colour. Its w:val attribute takes the colour as an RGB hexadecimal value.
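
Because the w:val attribute of <w:sz> counts half-points, its value is always twice the point size shown in Word's user interface; a size of 10.5 pt, for example, is written as 21. The following fragment is a minimal sketch with illustrative font, size, colour and text. The <w:sz-cs> element is taken from the list of <w:rPr> child elements above and is assumed here to be the complex-script counterpart of <w:sz>; the w and wx prefixes are assumed to be bound as in the previous listing:

```xml
<w:p>
  <w:r>
    <w:rPr>
      <w:rFonts w:ascii="Arial" w:h-ansi="Arial" w:cs="Arial"/>
      <wx:font wx:val="Arial"/>
      <!-- 10.5 pt * 2 = 21 half-points -->
      <w:sz w:val="21"/>
      <w:sz-cs w:val="21"/>
      <!-- blue as an RGB hexadecimal value (RRGGBB) -->
      <w:color w:val="0000FF"/>
    </w:rPr>
    <w:t>10,5 Punkt, blau</w:t> <!-- en: 10.5 point, blue -->
  </w:r>
</w:p>
```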

Line breaks and tabulators

In WordML, the two elements <w:tab> and <w:br> create tabs and line breaks, respectively. The following example shows how both elements are used.

<w:docPr>
<w:defaultTabStop w:val="708"/>           (2)
</w:docPr>
 ...
<w:p>
   <w:r>
     <w:t>Hier</w:t> <!-- en: Here -->
   </w:r>
   <w:r>
     <w:tab/>                             (1)
     <w:t>befinden</w:t> <!-- you can -->
   </w:r>
   <w:r>
     <w:tab/>
     <w:t>sich</w:t> <!-- find -->
   </w:r>
   <w:r>
     <w:tab />
     <w:t>Tabulatoren</w:t> <!-- tabulators -->
   </w:r>
   <w:r>
     <w:br/>                              (3)
     <w:t>und</w:t> <!-- and -->
   </w:r>
   <w:r>
     <w:tab/><w:t>Umbrüche.</w:t> <!-- line breaks. -->
   </w:r>
</w:p>

Figure: tabulators

(1) The <w:tab> element generates a tab. It has three possible attributes, all from the auxiliary namespace wx. The attributes wx:wTab, wx:tlc and wx:cTlc describe the position of the text in the line. They cannot sensibly be written by hand, since information about line breaks is normally not available when WordML is created manually. Word 2003 does not process these attributes.

(2) The default spacing of tab stops is set by the <w:defaultTabStop> element. Its w:val attribute is given in twips (twentieths of a point); the value 708 used here corresponds to 35.4 pt, or about 1.25 cm.

(3) The <w:br> element generates a line break. It has an optional w:type attribute with the possible values page (for page breaks), column (for column breaks) and text-wrapping (the default, for line breaks). In Word, a line break is entered with the key combination Shift + Return.
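
The type attribute described in (3) can be used, for example, to force a page break in the middle of a paragraph. The following fragment is a minimal sketch with made-up sample text; it assumes the attribute is written with the w prefix, like the other attributes in this chapter:

```xml
<w:p>
  <w:r>
    <w:t>Ende der ersten Seite.</w:t> <!-- en: End of the first page. -->
  </w:r>
  <w:r>
    <!-- page break instead of the default text-wrapping line break -->
    <w:br w:type="page"/>
    <w:t>Anfang der zweiten Seite.</w:t> <!-- en: Start of the second page. -->
  </w:r>
</w:p>
```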

In the tab example above, the default tab stop spacing of the Word document was used. Tab stops can also be defined locally for each paragraph.

In Word, the corresponding dialog is reached via Format -> Absatz (Paragraph) -> Tabstopps (Tabs).

Figure: how to set tabulators

<w:p>
   <w:pPr>
     <w:tabs>                                               (1)
       <w:tab w:val="left" w:pos="1080"/>                   (1)
       <w:tab w:val="left" w:pos="1980"/>                   (1)
       <w:tab w:val="left" w:pos="3780"/>                   (1)
       <w:tab w:val="left" w:pos="4680"/>                   (1)
     </w:tabs>
   </w:pPr>
   <w:r>
     <w:t>Hier</w:t>
   </w:r>
   <w:r>
     <w:tab wx:wTab="645" wx:tlc="none" wx:cTlc="8"/>       (2)
     <w:t>befinden</w:t>
   </w:r>
   <w:r>
     <w:tab wx:wTab="1770" wx:tlc="none" wx:cTlc="23"/>     (2)
     <w:t>sich</w:t>
   </w:r>
   <w:r>
     <w:tab wx:wTab="480" wx:tlc="none" wx:cTlc="5"/>       (2)
     <w:t>Tabulatoren</w:t>
   </w:r>
   <w:r>
     <w:br/>
     <w:t>und</w:t>
   </w:r>
   <w:r>
     <w:tab/>
     <w:t>Umbrüche.</w:t>
   </w:r>
</w:p>

Figure: tabulators

(1) The <w:tabs> element is a container for the tab stop definitions of the paragraph; each tab stop is defined by a <w:tab> element. The w:val attribute determines the alignment of the text that follows the tab stop. It may take the values clear, left, center, right, decimal, bar and list.

The w:pos attribute gives the position of the tab stop in twips (twentieths of a point). The optional w:leader attribute changes how the space spanned by the tab is filled. The possible values are none (also the default if the attribute is omitted), dot (a dotted line), hyphen (a dashed line), underscore (a solid line set low), heavy (a thicker solid line set low) and middle-dot (a dotted line at mid-height).

(2) The attributes wx:wTab, wx:tlc and wx:cTlc are written by Word 2003 when the document is saved, but are not processed when it is loaded. They record the position of the text in the document.

Figure: tabulator lines
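
A leader is typically combined with a right-aligned tab stop, for instance to connect an entry to its page number with a dotted line. The following fragment is a sketch under assumed values: the position of 8640 twips (8640 / 20 = 432 pt, i.e. 6 inches) and the sample text are illustrative only:

```xml
<w:p>
  <w:pPr>
    <w:tabs>
      <!-- right-aligned tab stop at 8640 twips, gap filled with dots -->
      <w:tab w:val="right" w:leader="dot" w:pos="8640"/>
    </w:tabs>
  </w:pPr>
  <w:r>
    <w:t>Kapitel 1</w:t> <!-- en: Chapter 1 -->
  </w:r>
  <w:r>
    <w:tab/>
    <w:t>17</w:t>
  </w:r>
</w:p>
```

The wx:wTab, wx:tlc and wx:cTlc attributes are deliberately left out, since Word 2003 does not process them when loading the document.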

The <w:t> element

The <w:t> element contains the textual content of a WordML document. Space characters are preserved; line breaks inside the element are converted into space characters by Word. The <w:t> element has no attributes and no child elements.

Unfortunately, Word 2003 sometimes cuts the textual content into pieces: it may generate several <w:r> elements with text even though no change of formatting makes this necessary. The following two paragraphs have equivalent content:

<w:p>
   <w:r>
     <w:t>Word fragments</w:t>
   </w:r>
</w:p>

<w:p>
   <w:r>
     <w:t>Wor</w:t>
   </w:r>
   <w:r>
     <w:t>d </w:t>
   </w:r>
   <w:r>
     <w:t>fra</w:t>
   </w:r>
   <w:r>
     <w:t>gm</w:t>
   </w:r>
   <w:r>
     <w:t>e</w:t>
   </w:r>
   <w:r>
     <w:t>n</w:t>
   </w:r>
   <w:r>
     <w:t>ts</w:t>
   </w:r>
</w:p>

Copyright © dpunkt.verlag GmbH 2007
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "Professionelle XML-Verarbeitung mit Word" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.

dpunkt.verlag GmbH, Ringstraße 19B, 69115 Heidelberg, fon 06221-14830, fax 06221-148399, hallo(at)dpunkt.de