# MathML and XML

(Excerpt from "The MathML Handbook" by Pavi Sandhu)

MathML is an application of XML. This means that the syntax of MathML — that is, its rules for using elements and attributes — is determined by the rules of XML. The vocabulary of MathML (the elements and attributes allowed) is determined by an XML DTD.

Each instance of MathML markup consists of Unicode characters organized into a nested tree of elements. MathML is case sensitive, but all element and attribute names are defined in lowercase for simplicity. MathML includes both container elements, such as mrow or apply, and empty elements, such as mspace or sin.

As explained under Elements and attributes, a container element consists of a start tag, an end tag, and the content included between them. An empty element consists of a single tag, written in XML with a trailing slash (for example, `<mspace/>`), and has no content or separate end tag. Strictly speaking, XML makes a clear distinction between elements and tags. However, it is customary to refer to the mrow element, for example, to mean the element whose start tag is mrow. This convention is followed throughout this book.
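The difference between container and empty elements can be seen by parsing a small fragment. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the sample markup and the helper name `is_empty` are illustrative assumptions, not part of MathML.

```python
import xml.etree.ElementTree as ET

# An mrow container holding two token elements and one empty mspace element.
markup = '<mrow><mi>x</mi><mspace width="1em"/><mn>2</mn></mrow>'

root = ET.fromstring(markup)

def is_empty(element):
    """True if the element has neither child elements nor text content."""
    return len(element) == 0 and not (element.text or "").strip()

# Record, for each child of mrow, whether it is an empty element.
summary = [(child.tag, is_empty(child)) for child in root]

print(root.tag, is_empty(root))  # mrow False
print(summary)                   # [('mi', False), ('mspace', True), ('mn', False)]
```

Note that an empty element can still carry attributes, as mspace does here with width.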

MathML also contains some **additional syntax rules** that go beyond those specified by XML. These additional rules fall into two categories:

- MathML places restrictions on the type of value that certain attributes can take. For example, the content token element cn, which is used to represent numbers, has an attribute called base, which specifies the base of the number encoded. This attribute can take only integer values between 2 and 36.
- MathML places restrictions on the number of child elements of certain elements and assigns a special meaning to those child elements based on their order. Child elements of this type are called *arguments*. For example, the msup element always has two arguments; the first one is interpreted as the base and the other as the superscript.

XML syntax does not provide a way of specifying these two types of constraints. Hence, these rules are not specified in the MathML DTD and an XML processor does not recognize their violation as an error. However, violation of these rules is a MathML error and will be recognized as such by applications that process MathML.
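To illustrate the point, the sketch below shows markup that an XML parser accepts without complaint but that breaks the two MathML rules given above. It is a hedged example, not a full validator: the function name `check_mathml_rules` and the sample strings are assumptions, and only the cn base range and the msup argument count are checked.

```python
import xml.etree.ElementTree as ET

def check_mathml_rules(markup):
    """Return a list of violations of the two MathML-specific rules sketched here."""
    root = ET.fromstring(markup)  # raises ParseError only for XML-level errors
    problems = []
    for element in root.iter():
        # Rule 1: the base attribute of cn must be an integer between 2 and 36.
        if element.tag == "cn" and "base" in element.attrib:
            value = element.get("base")
            try:
                base = int(value)
            except ValueError:
                base = None
            if base is None or not 2 <= base <= 36:
                problems.append(f"cn: base={value!r} is not an integer between 2 and 36")
        # Rule 2: msup takes exactly two arguments (base, then superscript).
        if element.tag == "msup" and len(element) != 2:
            problems.append(f"msup: expected 2 arguments, found {len(element)}")
    return problems

good_samples = ["<msup><mi>x</mi><mn>2</mn></msup>", '<cn base="16">FF</cn>']
bad_samples = ["<msup><mi>x</mi></msup>", '<cn base="40">1</cn>']

for sample in good_samples + bad_samples:
    # Every sample parses as XML; only the bad ones produce MathML-level problems.
    print(sample, "->", check_mathml_rules(sample))
```

All four samples are well-formed XML, so `ET.fromstring` never raises; the violations only surface in the MathML-level check.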

## The Root Element

We saw earlier that every XML document must have exactly one root element. For a MathML document, the root element must be a math element. In addition, whenever a MathML document is embedded in another XML or HTML document, it is good practice to declare the namespace of the math element. This ensures that the math element, as well as all other elements contained within it, is recognized as a MathML element.
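The effect of the namespace declaration can be sketched as follows. The host document here is a made-up XHTML page used only for illustration; the MathML namespace URI is `http://www.w3.org/1998/Math/MathML`. Once the namespace is declared on the math element, a namespace-aware parser reports every element inside it as belonging to that namespace.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A hypothetical host document (XHTML) containing one embedded MathML expression.
document = f"""
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>The area is
      <math xmlns="{MATHML_NS}"><mi>π</mi><msup><mi>r</mi><mn>2</mn></msup></math>.
    </p>
  </body>
</html>
"""

root = ET.fromstring(document)

# ElementTree writes qualified names as "{namespace-uri}local-name".
math = root.find(f".//{{{MATHML_NS}}}math")
mathml_tags = [element.tag for element in math.iter()]

# The declaration on math is inherited by every element nested inside it.
all_in_namespace = all(tag.startswith(f"{{{MATHML_NS}}}") for tag in mathml_tags)
print(all_in_namespace)  # True
```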

Every example of MathML markup must be enclosed by a single top-level math element. This means that if you copy part of a MathML expression from an existing MathML document, you must add an outer math element to the expression before it can be used as a valid piece of free-standing MathML markup. Conversely, if you take a MathML expression and paste it into an existing MathML document, the outer math tags of the expression being pasted must be removed, so that the destination document does not contain more than one math element. This behavior is automatically built into all applications for copying, pasting, and processing MathML, such as equation editors.
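The copy-and-paste behavior described above might be implemented along these lines. This is a rough sketch under simplifying assumptions: namespaces are omitted for brevity, and the function names `as_standalone` and `paste_into` are hypothetical, not taken from any particular equation editor.

```python
import xml.etree.ElementTree as ET

def as_standalone(fragment):
    """Wrap a MathML fragment in a math element unless it already has one."""
    element = ET.fromstring(fragment)
    if element.tag == "math":
        return element
    math = ET.Element("math")
    math.append(element)
    return math

def paste_into(destination, expression):
    """Append an expression to a math element, dropping its outer math tags."""
    source = ET.fromstring(expression)
    children = list(source) if source.tag == "math" else [source]
    destination.extend(children)
    return destination

# Copying: a bare fraction gains an outer math element.
standalone = as_standalone("<mfrac><mn>1</mn><mn>2</mn></mfrac>")
print(ET.tostring(standalone, encoding="unicode"))
# <math><mfrac><mn>1</mn><mn>2</mn></mfrac></math>

# Pasting: the incoming expression loses its own math wrapper.
merged = paste_into(standalone, "<math><mo>+</mo><mi>x</mi></math>")
print(ET.tostring(merged, encoding="unicode"))
# <math><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>+</mo><mi>x</mi></math>
```

After the paste, the result still contains exactly one math element.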

The most important attribute of the math element is display. This can take two values, inline or block. The default setting display="inline" is suitable for equations that are to be displayed inside a paragraph of text. With this setting, some operators such as the integral and summation symbols are shown in a smaller size. In addition, their limits are shown as subscripts and superscripts, so the equation takes up less room vertically. The setting display="block" is used when equations are to be displayed on a separate line, by themselves. With this setting, symbols such as integral and summation signs are shown in a larger size and their limits are shown as underscripts and overscripts.
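As a small illustration, the sketch below sets the display attribute on a math element that contains a summation with limits. The helper name `set_display` and the sample expression are assumptions; the code only changes the attribute, and how the limits are actually drawn is left to whichever application renders the result.

```python
import xml.etree.ElementTree as ET

def set_display(math_markup, display="inline"):
    """Return the markup with the display attribute of its math element set."""
    if display not in ("inline", "block"):
        raise ValueError("display must be 'inline' or 'block'")
    math = ET.fromstring(math_markup)
    if math.tag != "math":
        raise ValueError("the outermost element must be math")
    math.set("display", display)
    return ET.tostring(math, encoding="unicode")

# A summation of x from i = 1 to n, written with a character reference for the sum sign.
summation = (
    "<math><munderover><mo>&#x2211;</mo>"
    "<mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi>"
    "</munderover><mi>x</mi></math>"
)

block_version = set_display(summation, "block")    # for an equation on its own line
inline_version = set_display(summation, "inline")  # for an equation inside a paragraph
print(block_version)
```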

For MathML equations to be integrated into a Web page, the MathML markup must be inserted at appropriate places in the HTML (or XHTML) document that defines the Web page. When the browser processes the HTML document and comes across the MathML islands, it either renders them directly or passes them to a plug-in for processing. See Combining presentation and content markup for detailed information on how to embed MathML equations in an HTML document for display by specific browsers and plug-ins.

At the time of writing, two browsers, *Amaya* and *Mozilla*, can natively display MathML. *Amaya* is the test browser provided by the W3C for testing new Web technologies. It is available for Windows and various Unix platforms but not for Macintosh. *Mozilla* is an open-source browser that will serve as the basis for the next version of *Netscape*. It is available for all major platforms.

*IE* and *Netscape* support the display of MathML using special add-on software, such as IBM's *techexplorer* or Design Science's *MathPlayer* or *WebEQ*.

For details of how to embed MathML markup in an HTML document and information on configuring specific browsers to view MathML, see Applying styles and transformations.


**Copyright © CHARLES RIVER MEDIA, INC., Massachusetts (USA) 2003**

Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "The MathML Handbook" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.

CHARLES RIVER MEDIA, INC., 20 Downer Avenue, Suite 3, Hingham, Massachusetts 02043, United States of America