Elements and Attributes

(Excerpt from "The MathML Handbook" by Pavi Sandhu)

An XML document consists of text organized into one or more elements. There are two types of elements: container elements and empty elements. Here is a simple XML document consisting of a single container element:

<author>Mark Twain</author>

Each container element consists of a start tag followed by some data followed by an end tag. The data enclosed between the start and end tags is called the element's content. The start and end tags consist of angle brackets enclosing the name of the element, with a slash before the name in the end tag. This syntax is similar to that used in HTML. However, unlike in HTML, XML element names are case sensitive. Hence, the element title is different from Title and TITLE. Element names can contain any number of letters, digits, underscores, hyphens, or periods, but they must start with a letter or underscore.
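As a quick check of these rules, here is a minimal sketch that feeds a few strings to a parser. It assumes Python's standard xml.etree.ElementTree module, which is not part of the original text, and the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Parse the single-element document from the text.
author = ET.fromstring("<author>Mark Twain</author>")
print(author.tag)   # the element name
print(author.text)  # the element's content


def is_well_formed(xml_text: str) -> bool:
    """Hypothetical helper: True if xml_text parses as XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Matching start and end tags parse without error.
print(is_well_formed("<title>Huckleberry Finn</title>"))
# Names are case sensitive, so </Title> does not close <title>.
print(is_well_formed("<title>Huckleberry Finn</Title>"))
# A name may not start with a digit.
print(is_well_formed("<1title>Huckleberry Finn</1title>"))
```

The second and third calls report False because the parser rejects the mismatched end tag and the name that starts with a digit.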

The second type of XML element is called a canonically empty element, because it cannot contain any content. An empty element has the syntax <blank/>. This can also be written in the equivalent form <blank></blank>; that is, like a container element without any content. Both forms can be used interchangeably, but the single-tag form, <blank/>, is more common, since it is more compact and it emphasizes that this is an empty element.

Although they do not contain any data, empty elements can still provide useful information based on the position in which they occur and the value of their attributes.
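The equivalence of the two empty-element forms can be checked with a parser. The sketch below again assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<blank/>")
long_form = ET.fromstring("<blank></blank>")

for element in (short_form, long_form):
    # Neither form has child elements or character data.
    print(element.tag, len(element), repr(element.text or ""))

# Once parsed, the two forms are indistinguishable and serialize identically.
same_serialization = ET.tostring(short_form) == ET.tostring(long_form)
print(same_serialization)
```

A parser keeps no record of which spelling was used, which is why the choice between the two forms is purely one of style.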

Each element can have zero or more attributes. An attribute is a parameter that describes some property of the element in which it occurs. The attributes for an element are always specified in the start tag after the element name, as shown in this example:

<author country="U.S.">Mark Twain</author>

The value of an attribute is always enclosed in either single quotation marks or double quotation marks. Here the attribute country has the value "U.S.". Attributes can be set up (for example, by a declaration in a DTD) to have a default value that is automatically assumed if you do not explicitly specify a value.

An element can have any number of attributes as long as each attribute has a unique name. Here is an example of an element with three attributes:

<point xcoord="1.2" ycoord="3.1" zcoord="6.0"/>
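The attribute rules above can be sketched with a parser. This example assumes Python's standard xml.etree.ElementTree module, which the original text does not use, and writes the point element in its self-closing form so that each string is a complete document.

```python
import xml.etree.ElementTree as ET

# Single and double quotation marks are interchangeable delimiters.
double_quoted = ET.fromstring('<author country="U.S.">Mark Twain</author>')
single_quoted = ET.fromstring("<author country='U.S.'>Mark Twain</author>")
print(double_quoted.get("country") == single_quoted.get("country"))

# Attributes are exposed as a name-to-value mapping; every value is a string.
point = ET.fromstring('<point xcoord="1.2" ycoord="3.1" zcoord="6.0"/>')
print(point.attrib)

# Repeating an attribute name within one start tag is an error.
try:
    ET.fromstring('<point xcoord="1.2" xcoord="3.1"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Note that the parser returns "1.2" as text; converting it to a number is left to the application.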

When you are constructing an XML document, you can often describe the same information either by using an element or an attribute. For instance, the information in the three-attribute example can also be described in XML in the following way:

<point>
  <xcoord>1.2</xcoord>
  <ycoord>3.1</ycoord>
  <zcoord>6.0</zcoord>
</point>

Here, information about each coordinate is given in a separate element instead of as an attribute to the point element. There is no hard and fast rule for deciding when to use elements or attributes. In general, your own taste and judgment mainly determine whether you choose one or the other. However, elements are usually preferable in the following two situations:

  • When you are encoding a parameter that can take multiple values, such as a person's phone number or occupation. Attributes are unsuitable for this purpose, since a given attribute can appear only once in an element and holds a single value.
  • When you are encoding information that has a complex structure, such as a name. This is because the substructure of the name (that is, the first name, last name, and middle initial) can be encoded by additional elements. It is not possible to do this using attributes, since an attribute value can only be a simple text string.
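The trade-off can be sketched in code. The example below assumes Python's standard xml.etree.ElementTree module. The person record, with its name and phone elements and their values, is hypothetical and appears nowhere in the original text.

```python
import xml.etree.ElementTree as ET

# The same point, written once with attributes and once with child elements.
attribute_form = ET.fromstring('<point xcoord="1.2" ycoord="3.1" zcoord="6.0"/>')
element_form = ET.fromstring(
    "<point><xcoord>1.2</xcoord><ycoord>3.1</ycoord><zcoord>6.0</zcoord></point>"
)

names = ("xcoord", "ycoord", "zcoord")
from_attributes = [float(attribute_form.get(name)) for name in names]
from_elements = [float(element_form.findtext(name)) for name in names]
print(from_attributes == from_elements)

# Hypothetical record: repeated <phone> children hold several values, and
# <name> carries substructure that a single attribute string could not.
person = ET.fromstring(
    "<person>"
    "<name><first>Jane</first><middle>Q</middle><last>Doe</last></name>"
    "<phone>555-0100</phone>"
    "<phone>555-0101</phone>"
    "</person>"
)
phones = [phone.text for phone in person.findall("phone")]
last_name = person.findtext("name/last")
print(phones, last_name)
```

Both point forms yield the same three numbers, so for simple single-valued data the choice is a matter of taste. The person record shows the two cases where only elements will do.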

The content of an element can be character data, other XML elements, or a mixture of the two. Here is an example of an XML document showing an element that contains other elements:

<book>
  <title>Huckleberry Finn</title>
  <author>Mark Twain</author>
</book>

We say that the book element is the parent of the title and author elements. Conversely, the title and author elements are called children of the book element and siblings of each other. XML documents have a tree structure with each element corresponding to a single node of the tree. The root of the document tree in the above example is the book element, and the branches of the tree are the author and title elements.
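These family relationships map directly onto the objects a parser returns. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the parent_of dictionary is an illustrative helper, needed because this particular library does not store a link from a child back to its parent.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<title>Huckleberry Finn</title>"
    "<author>Mark Twain</author>"
    "</book>"
)

# The root is the parent; iterating over it yields its children in order.
children = [child.tag for child in book]
print(book.tag, children)

# Build a child-to-parent map by visiting every element in the tree.
parent_of = {child: parent for parent in book.iter() for child in parent}

title = book.find("title")
author = book.find("author")
# Siblings are simply elements that share the same parent.
print(parent_of[title] is book, parent_of[title] is parent_of[author])
```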

Each XML document must have exactly one root element. However, the tree can contain any number of branches, nested as deeply as you like. Here is a slightly more complicated XML document, with three levels of parent-child relationships:

<library>
  <book>
    <title>Huckleberry Finn</title>
    <author>Mark Twain
      <born>1835</born>
      <died>1910</died>
    </author>
  </book>
  <book>
    <title>Moby Dick</title>
    <author>Herman Melville
      <born>1819</born>
      <died>1891</died>
    </author>
  </book>
</library>
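To see how a program might walk this tree, here is a sketch that assumes Python's standard xml.etree.ElementTree module. The names LIBRARY_XML, depth, and records are made up for the illustration. Note that each author element has mixed content: character data (the name) followed by the born and died child elements.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<library>
  <book>
    <title>Huckleberry Finn</title>
    <author>Mark Twain
      <born>1835</born>
      <died>1910</died>
    </author>
  </book>
  <book>
    <title>Moby Dick</title>
    <author>Herman Melville
      <born>1819</born>
      <died>1891</died>
    </author>
  </book>
</library>"""

library = ET.fromstring(LIBRARY_XML)


def depth(element: ET.Element) -> int:
    """Count the element levels at and below this element."""
    return 1 + max((depth(child) for child in element), default=0)


records = []
for book in library.findall("book"):
    author = book.find("author")
    records.append({
        "title": book.findtext("title"),
        # The text before the first child element is the author's name.
        "author": author.text.strip(),
        "born": int(author.findtext("born")),
        "died": int(author.findtext("died")),
    })

for record in records:
    print(record)

# Four element levels means three levels of parent-child relationships.
print(depth(library) - 1)

# A document with two top-level elements has no single root and is rejected.
try:
    ET.fromstring("<book/><book/>")
    second_root_rejected = False
except ET.ParseError:
    second_root_rejected = True
print(second_root_rejected)
```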

   

Copyright © CHARLES RIVER MEDIA, INC., Massachusetts (USA) 2003
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "The MathML Handbook" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.


CHARLES RIVER MEDIA, INC., 20 Downer Avenue, Suite 3, Hingham, Massachusetts 02043, United States of America