(Excerpt from "The MathML Handbook" by Pavi Sandhu)

TeX4ht, developed by Eitan Gurari of Ohio State University, is a powerful and versatile system for converting TeX documents into HTML or XML formats. In its default mode, TeX4ht converts a TeX or LaTeX document into HTML with all mathematical formulas saved as images. However, TeX4ht can be readily configured to produce other document types such as XHTML, DocBook, or Text Encoding Initiative (TEI). It can also convert formulas present in the original document into MathML instead of images.

The TeX4ht system has two main components: a set of style files and a postprocessor. The process of converting a TeX document takes place in two stages: first, TeX processes the original document using the style files provided by TeX4ht. The result is a DVI file that contains "hooks" or special instructions meant for the TeX4ht postprocessor. In the next stage, the postprocessor acts on the DVI file and interprets the hooks in the file to produce the final output.
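
The two stages can be made concrete with a small sketch. The Python fragment below only builds the command line for each stage and does not run anything. The executable names latex and tex4ht are the ones commonly shipped with TeX and TeX4ht, but the helper plan_conversion is hypothetical. The scripts supplied with TeX4ht may add or repeat steps, and they are what arranges for the TeX4ht style files to be loaded in the first stage.

```python
from pathlib import Path


def plan_conversion(tex_file):
    """Return the two stages described above as (description, command) pairs.

    Assumption: the executables are named "latex" and "tex4ht". Nothing is
    executed here; the function only shows what each stage reads and writes.
    """
    source = Path(tex_file)
    dvi = source.with_suffix(".dvi")
    return [
        # Stage 1: TeX typesets the source and writes a DVI file with hooks.
        (f"TeX pass: {source.name} -> {dvi.name}", ["latex", source.name]),
        # Stage 2: the postprocessor reads the DVI file and interprets the hooks.
        (f"postprocessor: {dvi.name} -> final output", ["tex4ht", dvi.name]),
    ]


for description, command in plan_conversion("article.tex"):
    print(description, "|", " ".join(command))
```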

Since TeX itself handles the conversion from the original TeX document to the DVI file, TeX4ht has access to the full power of TeX for typesetting the document. In particular, TeX4ht can use TeX's capabilities for handling fonts, macros, variables, and so on to control the form of the output. TeX4ht can also handle most user-defined macros that occur in a LaTeX document.

TeX4ht offers several features that are useful for Web publishing. You can place separate sections of the LaTeX document on separate Web pages, with appropriate hyperlinks connecting them. You can also create HTML versions of tables of contents, bibliographies, and so on. The LaTeX Web Companion by Michel Goossens and Sebastian Rahtz provides detailed instructions on customizing the output of TeX4ht.

For converting TeX documents into XHTML+MathML documents, you do not have to customize TeX4ht yourself, since most of the work has already been done. Paul Gartside of the University of Pittsburgh has created a modified form of TeX4ht called TeX4moz. This contains some additional scripts and configuration files that customize the output of TeX4ht to produce XHTML+MathML files that can be displayed by Mozilla.

Installing TeX4ht

Since TeX4ht uses TeX to handle the first stage of processing the source document, you must have a working installation of TeX on your system. If you do not already have TeX installed, you can download all the relevant files from the TeX Users Group (TUG) Web site. This site also contains a wealth of information on all aspects of TeX, ranging from tutorials for beginners to specialized information for more advanced users.

On Windows

To install TeX4moz on Windows, follow these steps:

  1. Download TeX4moz. (Note: The download website is no longer available!)
  2. Create a directory called c:\tex4ht and unzip all the files into this directory.
  3. Modify the files tex4ht.env and moz4ht.env by editing the lines starting with tc:\path\tfm! to specify the directories in which the TeX font metric (tfm) files are located on your machine. For example, if you are using MiKTeX, which has its tfm files in c:\texmf\fonts\tfm, change that line in each .env file to tc:\texmf\fonts\tfm\!. The ! at the end of the line indicates that TeX4ht should search all subdirectories of the specified path for the font metric files.
  4. Rename the htlatex.tab, httex.tab, mztex.tab, and mzlatex.tab files to change the file extension from .tab to .bat.
  5. Add the c:\tex4ht directory to your path. To do this on Windows 2000/XP, open the System control panel, click the Advanced tab, click the Environment Variables button, select Path in the list of system variables, and click Edit. In the dialog that comes up, add c:\tex4ht as one of the values of the Path variable, and then click OK.
  6. Move tex4ht.sty and all the '.4ht' files to the c:\tex4ht directory. Alternatively, you can modify the environment variable TEXINPUTS to point to c:\tex4ht, using the same procedure outlined in Step 5.
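
Steps 3 and 4 are mechanical and can be scripted. The following Python sketch assumes that the .env files contain the placeholder line exactly as quoted in Step 3 and that the four .tab files sit in the installation directory. The function names set_tfm_dir and rename_tab_to_bat are hypothetical helpers, not part of TeX4moz.

```python
from pathlib import Path

# Placeholder prefix quoted in Step 3; assumed to appear verbatim in the .env files.
PLACEHOLDER = r"tc:\path\tfm!"
TAB_FILES = ("htlatex.tab", "httex.tab", "mztex.tab", "mzlatex.tab")


def set_tfm_dir(env_text, tfm_dir):
    """Replace every placeholder line with a line naming the real tfm directory."""
    # As in Step 3: the letter t, the path, a backslash, and a trailing "!"
    # that requests a search of all subdirectories.
    new_line = "t" + tfm_dir.rstrip("\\") + "\\!"
    lines = [
        new_line if line.startswith(PLACEHOLDER) else line
        for line in env_text.splitlines()
    ]
    return "\n".join(lines) + ("\n" if env_text.endswith("\n") else "")


def rename_tab_to_bat(directory, names=TAB_FILES):
    """Rename the listed .tab files to .bat; files that are missing are skipped."""
    renamed = []
    for name in names:
        source = Path(directory) / name
        if source.is_file():
            target = source.with_suffix(".bat")
            source.rename(target)
            renamed.append(target)
    return renamed
```

To apply the first helper, read each .env file, pass its text and your tfm directory (for example c:\texmf\fonts\tfm) to set_tfm_dir, and write the result back. It is worth keeping a backup copy of the original files first.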

On Unix

There are two ways of installing TeX4ht on Unix. You can install it in your local directory, in which case it is available for use only by you. Alternatively, if you have root access, you can do a root installation, in which case the program will be available to all users who have access to that machine. The installation on Unix requires the following steps:

  1. Download the archive that contains the package files.
  2. Untar and decompress the archive.
  3. Run the installer.

For a local installation, one additional step is required. You need to modify the values of the environment variables PATH and TEXINPUTS so that they include the directory in which the TeX4moz files are installed. You can change the values of these variables by editing your shell's startup file (for example, .bashrc or .cshrc, depending on the shell you use).
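
As an illustration of what these two variables should end up containing, here is a minimal Python sketch that builds a suitable environment for a child process. The installation directory /home/user/tex4moz and the function name local_install_env are assumptions made for the example. The trailing colon in TEXINPUTS relies on the convention, used by Kpathsea-based TeX systems, that an empty path element stands for the default search path.

```python
def local_install_env(install_dir, base_env):
    """Return a copy of base_env with PATH and TEXINPUTS including install_dir."""
    env = dict(base_env)
    # Put the installation directory first so that its scripts are found on PATH.
    old_path = env.get("PATH", "")
    env["PATH"] = install_dir + (":" + old_path if old_path else "")
    # Keep any existing TEXINPUTS value after the new directory. If none is set,
    # the result ends in ":" so that TeX still searches its default directories.
    env["TEXINPUTS"] = install_dir + ":" + env.get("TEXINPUTS", "")
    return env


# Hypothetical installation directory, shown only for illustration.
env = local_install_env("/home/user/tex4moz", {"PATH": "/usr/bin"})
print(env["PATH"])
print(env["TEXINPUTS"])
```

The same two values are what you would assign to PATH and TEXINPUTS to make the change permanent.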

Unlike on Windows, there is no need to change the file extensions of any files.

Running TeX4ht

To process a document using TeX4ht, you run a command of the following form:

mzlatex filename

The output file is optimized for viewing in Mozilla: it is an XHTML file with a .xml file extension that contains a DOCTYPE declaration referencing the XHTML DTD. As it stands, this file cannot be rendered in Internet Explorer (IE). However, you can easily modify the file so it is viewable in IE using either MathPlayer or IBM techexplorer. To do so, add an xml-stylesheet processing instruction near the top of the file that references the Universal MathML stylesheet.
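
The modification for IE can be sketched as follows. The stylesheet reference is written as an xml-stylesheet processing instruction placed directly after the XML declaration, and the URL is the W3C-hosted copy of the Universal MathML stylesheet. The function name add_mathml_stylesheet is hypothetical, and you may prefer to point href at a local copy of the stylesheet instead.

```python
# W3C-hosted location of the Universal MathML stylesheet.
STYLESHEET_PI = (
    '<?xml-stylesheet type="text/xsl" '
    'href="http://www.w3.org/Math/XSL/mathml.xsl"?>'
)


def add_mathml_stylesheet(xml_text):
    """Insert the stylesheet reference once, after the XML declaration if any."""
    text = xml_text.lstrip("\ufeff")  # drop a byte-order mark, if present
    if "<?xml-stylesheet" in text:
        return text  # a stylesheet reference is already present
    if text.startswith("<?xml "):
        end = text.index("?>") + 2  # position just past the XML declaration
        return text[:end] + "\n" + STYLESHEET_PI + text[end:]
    return STYLESHEET_PI + "\n" + text


sample = '<?xml version="1.0"?>\n<!DOCTYPE html>\n<html/>'
print(add_mathml_stylesheet(sample))
```

Applying the function twice leaves the text unchanged, so it is safe to rerun after regenerating the file.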

Let us look at an example of using TeX4ht to translate a TeX document into XHTML+MathML. The following example shows a LaTeX document that contains some mathematical formulas.

Example: A LaTeX document called article.tex that contains inline and display equations.

\documentclass{article}
\title{Electronic Structure of a Two-Dimensional Metal}
\begin{document}
\maketitle
The effect of the magnetic field can be included in the electronic structure calculation by using the Peierls substitution $$ t_l\rightarrow t_l e^{i{2\pi \over \phi_o}\int_{i,j}^{i',j'} {\bf A} \cdot d{\bf l}}$$ where $\phi_o=hc/e$ and $\bf A$ are the flux quantum and the vector potential, respectively.
For simplicity, we choose the Landau gauge ${\bf A}=-B(y,0,0)$. By following a standard procedure, we rewrite the Hamiltonian as a function of magnetic field in {\bf k}-space. It is straightforward to compute the thermodynamic quantities from the field $$ \Omega = -{2 \over \beta} \sum_{i=1}^{4 \tilde q} \sum_{\bf k} {\rm ln} [1+e^{-\beta (E_i({\bf k})-\mu)}]$$ where $\beta$, $E_i({\bf k})$ and $\mu$ denote the inverse temperature, the dispersion relation of the $i$-th magnetic subband and the chemical potential, respectively.
The field dependence of the chemical potential is calculated by inverting the constraint equation for occupation $$ N = 2 \sum_{i=1}^{4\tilde q} \sum_{\bf k} {1 \over e^{\beta (E_i({\bf k})-\mu)}+1} $$ where the factor 2 comes from the spin degeneracy. Because there are six electrons per unit cell distributed among four bands at zero magnetic field, the total occupancy factor $N/N_{max}$ is 3/4 where $N_{max} = 2\sum_{\bf k}\sum_{i=1}^{4\tilde q}1$. Once the chemical potential and thermodynamic potential are calculated as a function of fields, it is straightforward to compute the magnetization $M=-dF/dB$ from the free energy $F=\Omega+\mu N$.
\end{document}

To process this document using TeX4moz, run the following command:

mzlatex article.tex
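
If you want to drive the same command from a script, a minimal Python wrapper might look like the sketch below. It assumes that mzlatex is on your PATH and that the output keeps the input's base name with a .xml extension, as described in this section. The function names expected_output and run_mzlatex are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def expected_output(tex_file):
    """Name of the file mzlatex is described as producing: same stem, .xml suffix."""
    return Path(tex_file).with_suffix(".xml")


def run_mzlatex(tex_file):
    """Run mzlatex on tex_file and return the path of the XHTML+MathML output."""
    script = shutil.which("mzlatex")
    if script is None:
        raise FileNotFoundError("mzlatex is not on PATH; check the installation")
    source = Path(tex_file).resolve()
    # Run in the document's own directory, so any files written there
    # end up next to the input file.
    subprocess.run([script, source.name], cwd=source.parent)
    output = expected_output(source)
    if not output.is_file():
        raise RuntimeError(f"{output.name} was not created; see the TeX messages")
    return output


print(expected_output("article.tex"))
```

Calling run_mzlatex("article.tex") then corresponds to typing the command above in the directory that contains article.tex.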

Several auxiliary files are created in the same directory as the input file, and a large number of messages are displayed on the screen, just as when you process the document with TeX directly. This is because TeX4ht itself calls TeX. Once the TeX processing is over, the final output document, article.xml, is created. This is an XHTML+MathML document that contains the DOCTYPE declaration needed for viewing in Mozilla. The following figure shows how article.xml looks when viewed in Mozilla.

Figure: Converting the LaTeX document article.tex into XHTML+MathML using TeX4ht.

Compare this with the output produced by processing the same input document using LaTeX. This is shown in the next figure. You can see that the quality of the rendering produced by Mozilla is comparable to that of the TeX output.

Figure: The DVI file produced by processing article.tex using TeX.


Copyright © CHARLES RIVER MEDIA, INC., Massachusetts (USA) 2003
Printing of the online version is permitted exclusively for private use. Otherwise this chapter from the book "The MathML Handbook" is subject to the same provisions as those applicable for the hardcover edition: The work including all its components is protected by copyright. All rights reserved, including reproduction, translation, microfilming as well as storage and processing in electronic systems.

CHARLES RIVER MEDIA, INC., 20 Downer Avenue, Suite 3, Hingham, Massachusetts 02043, United States of America