MathML characters

Mathematical notation makes extensive use of special characters and symbols. Any language for representing mathematics must therefore provide a way of displaying the many extended characters that are widely used in various sub-fields of mathematics. In addition, the language must be open-ended enough to allow for display of new characters and symbols that may be invented in the future.

MathML enables you to directly include most of the extended characters that are important in mathematical notation. You can include character data in a MathML document in three different ways:

  • Type in characters directly from the keyboard. This is the usual way to enter common characters, such as those belonging to the ASCII character set.
  • Use numeric Unicode character references; for example, the letter A can be entered by typing &#65; or &#x41;, and the Greek letter α can be entered by typing &#x3B1;. (See XML and Unicode for more information on Unicode.)
  • Use named entity references defined in the MathML DTD. For example, you can enter the character α by typing &alpha;.
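
The three methods listed above can be compared side by side. The following is a minimal sketch, assuming the document is saved in an encoding that can represent α directly (such as UTF-8) and that the MathML DTD is declared so that the named entity resolves; each mi element is intended to denote the same character, U+03B1.

```xml
<math>
  <mrow>
    <mi>α</mi>        <!-- typed directly; needs a suitable file encoding -->
    <mo>+</mo>
    <mi>&#x3B1;</mi>  <!-- numeric character reference, hexadecimal -->
    <mo>+</mo>
    <mi>&#945;</mi>   <!-- numeric character reference, decimal -->
    <mo>+</mo>
    <mi>&alpha;</mi>  <!-- named entity reference; needs the MathML DTD -->
  </mrow>
</math>
```

The numeric forms work in any XML document, whereas the named form is only well-formed when a DTD that declares the entity is in effect.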

The MathML DTD includes entity declarations for over 2,500 special characters. Each entity declaration associates an entity name for a particular character with its Unicode numeric code. The various MathML characters are divided into groups according to the ISO character set they belong to (see the following table).

Table: MathML characters divided into groups according to the ISO character set.

ISO Group   Description
ISOAMSA     Added mathematical symbols: arrows
ISOAMSB     Mathematical symbols: binary operators
ISOAMSC     Mathematical symbols: delimiters
ISOAMSN     Mathematical symbols: negated relations
ISOAMSO     Mathematical symbols: ordinary
ISOAMSR     Mathematical symbols: relations
ISOBOX      Box and line drawing
ISOCYR1     Cyrillic-1
ISOCYR2     Cyrillic-2
ISODIA      Diacritical marks
ISOGRK3     Greek-3
ISOLAT1     Latin-1
ISOLAT2     Latin-2
ISOMFRK     Mathematical Fraktur
ISOMOPF     Mathematical Openface (Double-struck)
ISOMSCR     Mathematical script
ISONUM      Numerical and special graphic
ISOPUB      Publishing
ISOTECH     General technical
MMLEXTRA    Extra names added by MathML

The list of characters included in the MathML DTD is large and comprehensive enough to be sufficient for most practical purposes. However, in special cases it may be necessary to encode characters that are not defined in the DTD. For this purpose, MathML provides the mglyph element. See Character glyphs for a description of this element and an example of how it can be used to include nonstandard symbols in a MathML document.
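
As a rough illustration of the idea, the sketch below assumes the MathML 2.0 form of mglyph, which identifies a glyph by a font family and an index within that font and supplies alternate text. The font name my-braid-font, the index 2, and the alt text are hypothetical placeholders, not values from any real font.

```xml
<math>
  <mrow>
    <!-- hypothetical font and glyph index; alt is used if the glyph is unavailable -->
    <mi><mglyph fontfamily="my-braid-font" index="2" alt="23braid"/></mi>
    <mo>+</mo>
    <mi>x</mi>
  </mrow>
</math>
```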

Plane 1 characters

The characters in Unicode are organized into planes, two of which matter here: the Basic Multilingual Plane (BMP), or plane 0, and the Supplementary Multilingual Plane (SMP), or plane 1. Each plane has space for 2^16, or 65,536, characters. Most of the code points in plane 0 have already been assigned to specific characters, while the majority of code points in plane 1 are still unassigned.
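
Because each plane holds 65,536 code points, the plane of a character follows from its code point by integer division. The worked arithmetic below is only an illustration; the two sample code points, U+03B1 (Greek small alpha) and U+1D504 (Fraktur capital A), are chosen here as examples.

```latex
\[
  \operatorname{plane}(c) = \left\lfloor \frac{c}{2^{16}} \right\rfloor
\]
\[
  \operatorname{plane}(\mathtt{0x03B1}) = \left\lfloor \frac{945}{65536} \right\rfloor = 0,
  \qquad
  \operatorname{plane}(\mathtt{0x1D504}) = \left\lfloor \frac{120068}{65536} \right\rfloor = 1
\]
```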

The MathML DTD includes entity references for characters in both plane 0 and plane 1. However, most browsers do not currently support the display of plane 1 characters. Hence, if you include an entity reference corresponding to a plane 1 character in a MathML document, most browsers will display a ? symbol in place of that character. As a workaround to this problem, the W3C has created an interim version of the MathML DTD in which all entity references to plane 1 characters have been replaced with references to code points that belong to the private use area of plane 0.

The private use area of plane 0 consists of code points that have not been assigned to any characters in the official Unicode standard. Therefore, specific applications can use these code points to create private encodings for characters that are needed for special purposes.

For example, &Afr; is a MathML entity reference that represents the Fraktur capital A (𝔄). In the MathML DTD, this reference is replaced by the character reference &#x1D504;, which belongs to plane 1. However, in the modified MathML DTD, &Afr; would be replaced by the character reference &#xE504;, which belongs to the private use area of plane 0. Similarly, any plane 1 character reference of the form &#x1Dnnn; is replaced by a plane 0 reference of the form &#xEnnn; instead.
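
The difference between the two DTDs can be pictured as two alternative declarations of the same entity name. This is a sketch only: it uses Afr, the name of the Fraktur capital A entity in the MathML entity sets, and the exact wording of the declarations in the W3C files may differ. The two declarations belong to two separate DTDs and would not appear together in one file.

```xml
<!-- Standard MathML DTD: the name expands to a plane 1 code point. -->
<!ENTITY Afr "&#x1D504;">

<!-- Interim (modified) DTD: the same name expands to a code point
     in the private use area of plane 0. -->
<!ENTITY Afr "&#xE504;">
```

A document that writes &Afr; therefore does not need to change when switching DTDs; only the code point that the reference expands to differs.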

The modified DTD for MathML allows you to include references to MathML characters that belong to plane 1 and have them interpreted and displayed by browsers. It is therefore preferable to use the modified DTD instead of the real MathML DTD until native support for plane 1 characters becomes available in browsers. Since most MathML content displayed in browsers will be embedded in an XHTML document, the W3C has provided a DTD that merges the DTD for XHTML and the modified DTD for MathML. To include a reference to this combined XHTML+MathML DTD, you would need to include the following declaration in your document:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
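
To show where the declaration sits, here is a minimal sketch of a complete XHTML+MathML document. The title text and the choice of &Afr; as sample content are assumptions made for illustration; the namespace URIs are the standard ones for XHTML and MathML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
  "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Fraktur capital A</title>
  </head>
  <body>
    <p>
      <!-- the named entity resolves through the DTD declared above -->
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mi>&Afr;</mi>
      </math>
    </p>
  </body>
</html>
```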

For more information on this topic and other issues involved in displaying MathML in Web browsers, see MathML-enabled browsers.

Fonts

Of course, it is not enough to define entity references for all extended characters that are likely to be needed for mathematical notation. If you want to be able to use these characters in a document and have them display properly in a Web browser or other rendering application, you must install on your system special fonts that contain glyphs for all the characters.

There is considerable effort underway to develop fonts for all the MathML characters. In particular, a consortium of academic organizations and technical publishers called STIX is working to develop a set of glyphs for each of the MathML characters. In the meantime, however, the number of fonts available for displaying mathematical symbols is relatively small, and the few that do exist are proprietary and not widely distributed. The two products with the largest selection of fonts for displaying mathematical characters are the equation editor MathType and the computer algebra system Mathematica. The Computer Modern fonts, familiar from TeX/LaTeX documents, are another widely used set.

Non-marking characters and special constants

The majority of MathML characters are associated with specific glyphs and provide a visual representation of a particular operator, symbol, or identifier. However, a small number of MathML characters provide information about the meaning or structure of the markup in which they occur. These characters can be divided into two groups: non-marking characters and special constants.

The MathML characters listed in the following table are used to represent operators or identifiers that do not have any glyphs associated with them. They are called non-marking characters since they do not leave any visible marks when displayed in a document. There are several other non-marking characters for representing whitespace that are not listed in this table.

Table: MathML characters used to represent operators or identifiers unassociated with glyphs.

Character Name      Description
&InvisibleTimes;    Indicates multiplication.
&InvisibleComma;    Indicates separation between indices.
&ApplyFunction;     Indicates function application in presentation markup.
&Tab;               Tabulator stop; horizontal tabulation.
&NewLine;           Forces a line break; line feed.
&NonBreakingSpace;  Space that is not a legal breakpoint.
&ZeroWidthSpace;    Space of no width at all.
&VeryThinSpace;     Space of width 1/18 em.
&ThinSpace;         Space of width 3/18 em.
&MediumSpace;       Space of width 4/18 em.
&ThickSpace;        Space of width 5/18 em.

The MathML characters listed in the following table are used to represent special constants that are ordinarily represented by conventional letters. If you use the named entity reference instead of the more common letter, the specific mathematical meaning of the constant can be included in the markup.

Table: MathML characters used to represent special constants that are ordinarily represented by conventional letters.

Entity Name             Description
&CapitalDifferentialD;  D for use in differentials; e.g., within integrals.
&DifferentialD;         d for use in differentials; e.g., within integrals.
&ExponentialE;          e for use as the base of the natural logarithm.
&ImaginaryI;            i for use as a square root of –1.
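
As a small sketch of how the constants in the table above might appear in presentation markup, the following encodes e raised to the power iπ. The expression itself is chosen here only for illustration, and the MathML DTD is assumed to be in effect so that the named entities resolve.

```xml
<math>
  <msup>
    <mi>&ExponentialE;</mi>      <!-- the exponential base, not just a letter e -->
    <mrow>
      <mi>&ImaginaryI;</mi>      <!-- the imaginary unit, not just a letter i -->
      <mo>&InvisibleTimes;</mo>  <!-- multiplication with no visible mark -->
      <mi>&pi;</mi>
    </mrow>
  </msup>
</math>
```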

It is desirable to use these entity references whenever possible because they provide meaningful information that processing applications can use. This information can have specific consequences on the precise visual or aural rendering of the expression or its interpretation by a computer algebra system. For example, the characters &InvisibleTimes; and &ApplyFunction; denote the operation of multiplication and function application, respectively. Consider the following example:


<math>
  <mrow>
    <mi>f</mi>
    <mo>&ApplyFunction;</mo>
    <mrow>
      <mo>(</mo>
      <mrow>
        <mi>x</mi>
        <mo>&InvisibleTimes;</mo>
        <mi>y</mi>
      </mrow>
      <mo>)</mo>
    </mrow>
  </mrow>
</math>

This markup may be spoken in an audio rendering system as "f of x times y." This rendering is more faithful to the meaning of the markup than just "f x y", which is how the markup might be spoken if the named entity references were omitted.
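
For comparison, here is a sketch of the same expression with the two named entity references left out. It would typically look the same on screen, but it gives a processing application nothing to distinguish function application or multiplication from mere juxtaposition.

```xml
<math>
  <mrow>
    <mi>f</mi>
    <mrow>
      <mo>(</mo>
      <mrow>
        <!-- no operator between the identifiers: just "x y" -->
        <mi>x</mi>
        <mi>y</mi>
      </mrow>
      <mo>)</mo>
    </mrow>
  </mrow>
</math>
```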

   


Copyright © CHARLES RIVER MEDIA, INC., Massachusetts (USA) 2003
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "The MathML Handbook" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.


CHARLES RIVER MEDIA, INC., 20 Downer Avenue, Suite 3, Hingham, Massachusetts 02043, United States of America