I usually write my lecture notes in $\LaTeX$, both for the classes I take and for those I teach. (If you’re a coder and you write $\LaTeX$ for absolutely everything you do in class for five years or so, you get really fast at it.) I’ve yet to find a good $\LaTeX\rightarrow\text{HTML}$ converter, and I’ve tried several. There are good reasons for this, not the least of which is that the two markup languages are different enough to make translation difficult.

With MathJax (of which I am a huge fan) allowing specific bits of $\LaTeX$ to be entered directly into web sources via scripts, a lot of the problem is solved. But I still want to be able to include parts and packages of $\LaTeX$ not supported by MathJax, such as PGF and TikZ.

I’m going to approach this from a modular perspective, bit by bit over time. The first step, and the subject of this post, is a converter that produces *.svg files from graphics created via pdflatex. I’m going to write it in Python, partly because the language is well-suited to the task and partly because it will be fast to develop. After that is done, I’ll need to parse $\LaTeX$ code into three groups: (1) text that can be formatted in HTML, (2) $\LaTeX$ commands that can transfer directly to MathJax, and (3) $\LaTeX$ commands that will need to be converted to *.svg files.
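The three-way split is future work, but a rough sketch helps fix the idea. The function name classify, the list of environments, and the math-detection pattern below are all assumptions made for illustration; a real parser would need to be far more careful.

```python
import re

# Assumption: these are the environments MathJax cannot render.
SVG_ENVIRONMENTS = ("tikzpicture", "pgfpicture")

# Assumption: a chunk counts as math if it contains one of these delimiters.
MATH_PATTERN = re.compile(r"\$|\\\(|\\\[|\\begin\{(equation|align)\*?\}")


def classify(chunk):
    """Label a chunk of LaTeX source as 'svg', 'mathjax', or 'html'."""
    # Check for pictures first: math that wraps a picture (such as a table
    # containing TikZ drawings) still has to be rendered as an image.
    if any("\\begin{%s}" % env in chunk for env in SVG_ENVIRONMENTS):
        return "svg"
    if MATH_PATTERN.search(chunk):
        return "mathjax"
    return "html"
```

Checking for pictures before math is a deliberate choice in this sketch: a formula that contains a TikZ drawing cannot be handed to MathJax, so the whole chunk goes to the image converter.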

Dependencies

I’m writing this a bit selfishly, for my own setup. I use Debian-based Linux distributions, and I’ll need the following to be callable from my path: pdflatex, pdfcrop, pdf2svg, and standard shell commands such as ‘rm’.
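Before running anything, it can be worth confirming that those tools really are on the path. Here is a minimal sketch using shutil.which from the Python 3 standard library; the names REQUIRED_TOOLS and missing_tools are mine and are not part of latex2img.

```python
import shutil

# The tools this post assumes are callable from the PATH.
REQUIRED_TOOLS = ["pdflatex", "pdfcrop", "pdf2svg", "rm"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```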

Main Idea

The idea is that each bit of $\LaTeX$ in my finished document that is not text and cannot be given to MathJax should be stripped from my document and placed on a separate page. Each page is then written as a pdf, then cropped (the page style is empty, so there are no page numbers and so on) to the bounding box of the generated image, then saved as a vector format svg file. Once complete, the images are hooked back into the HTML document among the MathJax code.
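The pdf, crop, svg sequence can be sketched as three argument lists, one per stage. This is only an illustration: plan_commands is a hypothetical helper, and it builds shell-free argument lists, whereas the program below passes command strings to the shell. The file naming (tmp.NAME.tex, tmp.NAME.pdf, NAME.svg) follows the program below.

```python
import os


def plan_commands(source_path):
    """Return one argument list per pipeline stage for the given source file."""
    stem, ext = os.path.splitext(source_path)
    if ext.lower() not in (".tex", ".tikz"):
        # Unknown extension: keep the full name as the stem.
        stem = source_path
    tex = "tmp.%s.tex" % stem
    pdf = "tmp.%s.pdf" % stem
    svg = "%s.svg" % stem
    return [
        ["pdflatex", "-shell-escape", tex],  # typeset the wrapped source
        ["pdfcrop", pdf, pdf],               # crop to the bounding box, in place
        ["pdf2svg", pdf, svg],               # convert the cropped page to svg
    ]
```

Each list could be handed to subprocess.call without shell=True, which sidesteps quoting problems for file names containing spaces.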

The Code

The Python code for the program latex2img is as follows.

    #!/usr/bin/env python

    ###############################################################################
    # Jason B. Hill (Jason.B.Hill@Colorado.edu)
    ###############################################################################
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program. If not, see <http://www.gnu.org/licenses/>
    ###############################################################################

    import sys         # used to parse command line options
    import subprocess  # used to execute shell commands / obtain exit status
    import re          # used to format filenames

    # version numbers are printed in tenths
    latex2img_version = 0.1 # 9-8-11 Initial coding

    ###############################################################################
    # Define functions
    ###############################################################################

    def print_intro(version):
        print("################################################")
        print("# latex2img version %.1f" % version)
        print("# Jason B. Hill (www.jasonbhill.com)")
        print("################################################")

    def read_latex(file):
        lines = []
        try:
            FILE = open(file, "r")
        except IOError:
            print("Error: file read error")
            sys.exit(1)
        for blob in FILE:
            lines.append(blob)
        FILE.close()
        return lines

    def write_latex(file, latex_data):
        try:
            FILE = open(file, "w")
        except IOError:
            print("Error: file write error")
            sys.exit(1)
        FILE.write("\\documentclass[letterpaper,11pt]{article}\n")
        FILE.write("\\usepackage{fullpage,amsfonts,amsmath,amssymb,tikz}\n\n")
        FILE.write("\\pagestyle{empty}\n\n")
        FILE.write("\\begin{document}\n\n")
        #FILE.write("\$\n")
        for blob in latex_data:
            FILE.write(blob)
        #FILE.write("\$\n\n")
        FILE.write("\\end{document}\n")
        FILE.close()

    ###############################################################################
    # Parse options and raise errors if needed
    ###############################################################################

    if len(sys.argv) < 2:
        print("Error: Too few options passed.")
        print("Usage: latex2img file[.tex]")
        sys.exit(1)
    elif len(sys.argv) > 2:
        print("Error: Too many options passed.")
        print("Usage: latex2img file[.tex]")
        sys.exit(1)
    try:
        file_path = sys.argv[1]
        FILE = open(file_path, "r")
        FILE.close()
    except IOError:
        print("Error: File %s does not exist or is not readable." % file_path)
        sys.exit(1)

    ###############################################################################
    # Execute
    ###############################################################################

    ### Set filenames
    p = re.compile(r"^(.*)\.(tex|tikz)$", re.IGNORECASE)
    m = p.match(file_path)
    if m:
        tex_filename = "tmp.%s.tex" % m.group(1)
        pdf_filename = "tmp.%s.pdf" % m.group(1)
        svg_filename = "%s.svg" % m.group(1)
        temp_path = "tmp.%s*" % m.group(1)
    else:
        tex_filename = "tmp.%s.tex" % file_path
        pdf_filename = "tmp.%s.pdf" % file_path
        svg_filename = "%s.svg" % file_path
        temp_path = "tmp.%s*" % file_path

    ### Print information at shell
    print_intro(latex2img_version)

    ### Read tikz info and place in a LaTeX file
    latex_data = read_latex(file_path)
    write_latex(tex_filename, latex_data)

    ### Execute pdflatex
    command = "pdflatex -shell-escape %s" % tex_filename
    print("Executing command: " + command)
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    process.communicate() # drain stdout so a full pipe cannot stall the child
    if process.returncode == 0:
        print("(success) " + command)

    ### Execute pdfcrop
    command = "pdfcrop %s %s" % (pdf_filename, pdf_filename)
    print("Executing command: " + command)
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    process.communicate()
    if process.returncode == 0:
        print("(success) " + command)

    ### Execute pdf2svg
    command = "pdf2svg %s %s" % (pdf_filename, svg_filename)
    print("Executing command: " + command)
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    process.communicate()
    if process.returncode == 0:
        print("(success) " + command)

    ### Remove temporary files
    command = "rm %s" % temp_path
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    process.communicate()
    if process.returncode == 0:
        print("(success) " + command + " (temp files removed)")

Example

I’ll create a file named dynkin.tex somewhere in a directory with appropriate permissions. The file looks like this. (Notice that there are no $\LaTeX$ headers/footers or other definitions in the file.)

    $\begin{array}{|c|c|c|}
    \hline
    \textbf{Type} & \textbf{Cartan Matrix} & \textbf{Dynkin Diagram}\\
    \hline
    \hline
    & & \\
    E_6 &
    \left( \begin{array}{rrrrrr}
    2 & 0 & -1 & 0 & 0 & 0\\
    0 & 2 & 0 & -1 & 0 & 0\\
    -1 & 0 & 2 & -1 & 0 & 0\\
    0 & -1 & -1 & 2 & -1 & 0\\
    0 & 0 & 0 & -1 & 2 & -1\\
    0 & 0 & 0 & 0 & -1 & 2
    \end{array} \right) &
    \begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (5,0);
    \draw (3,0) -- (3,1);
    \filldraw[color=black] (1,0) circle (5pt); \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt); \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt); \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt); \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt); \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt); \filldraw[color=white] (3,1) circle (4pt);
    \end{tikzpicture}\\
    & & \\
    \hline
    & & \\
    E_7 &
    \left( \begin{array}{rrrrrrr}
    2 & 0 & -1 & 0 & 0 & 0 & 0\\
    0 & 2 & 0 & -1 & 0 & 0 & 0\\
    -1 & 0 & 2 & -1 & 0 & 0 & 0\\
    0 & -1 & -1 & 2 & -1 & 0 & 0\\
    0 & 0 & 0 & -1 & 2 & -1 & 0\\
    0 & 0 & 0 & 0 & -1 & 2 & -1\\
    0 & 0 & 0 & 0 & 0 & -1 & 2
    \end{array} \right) &
    \begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (6,0);
    \draw (3,0) -- (3,1);
    \filldraw[color=black] (1,0) circle (5pt); \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt); \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt); \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt); \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt); \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (6,0) circle (5pt); \filldraw[color=white] (6,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt); \filldraw[color=white] (3,1) circle (4pt);
    \end{tikzpicture}\\
    & & \\
    \hline
    & & \\
    E_8 &
    \left( \begin{array}{rrrrrrrr}
    2 & 0 & -1 & 0 & 0 & 0 & 0 & 0\\
    0 & 2 & 0 & -1 & 0 & 0 & 0 & 0\\
    -1 & 0 & 2 & -1 & 0 & 0 & 0 & 0\\
    0 & -1 & -1 & 2 & -1 & 0 & 0 & 0\\
    0 & 0 & 0 & -1 & 2 & -1 & 0 & 0\\
    0 & 0 & 0 & 0 & -1 & 2 & -1 & 0\\
    0 & 0 & 0 & 0 & 0 & -1 & 2 & -1\\
    0 & 0 & 0 & 0 & 0 & 0 & -1 & 2
    \end{array} \right) &
    \begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (7,0);
    \draw (3,0) -- (3,1);
    \filldraw[color=black] (1,0) circle (5pt); \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt); \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt); \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt); \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt); \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (6,0) circle (5pt); \filldraw[color=white] (6,0) circle (4pt);
    \filldraw[color=black] (7,0) circle (5pt); \filldraw[color=white] (7,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt); \filldraw[color=white] (3,1) circle (4pt);
    \end{tikzpicture}\\
    & & \\
    \hline
    & & \\
    F_4 &
    \left( \begin{array}{rrrr}
    2 & -1 & 0 & 0\\
    -1 & 2 & -2 & 0\\
    0 & -1 & 2 & -1\\
    0 & 0 & -1 & 2
    \end{array} \right) &
    \begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (2,0);
    \draw (3,0) -- (4,0);
    \draw (2,0.1) -- (3,0.1);
    \draw (2,-0.1) -- (3,-0.1);
    \draw (2.4,0.2) -- (2.6,0) -- (2.4,-0.2);
    \filldraw[color=black] (1,0) circle (5pt); \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt); \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt); \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt); \filldraw[color=white] (4,0) circle (4pt);
    \end{tikzpicture}\\
    & & \\
    \hline
    & & \\
    G_2 &
    \left( \begin{array}{rr}
    2 & -1\\
    -3 & 2
    \end{array} \right) &
    \begin{tikzpicture}[scale=0.7]
    \draw (1,0.1) -- (2,0.1);
    \draw (1,-0.1) -- (2,-0.1);
    \draw (1,0) -- (2,0);
    \draw (1.6,0.2) -- (1.4,0) -- (1.6,-0.2);
    \filldraw[color=black] (1,0) circle (5pt); \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt); \filldraw[color=white] (2,0) circle (4pt);
    \end{tikzpicture}\\
    & & \\
    \hline
    \end{array}$

Then, I place the latex2img program somewhere accessible by my path. Then, we issue the following command.

    jason@descartes:~$ latex2img dynkin.tex
    ################################################
    # latex2img version 0.1
    # Jason B. Hill (www.jasonbhill.com)
    ################################################
    Executing command: pdflatex -shell-escape tmp.dynkin.tex
    (success) pdflatex -shell-escape tmp.dynkin.tex
    Executing command: pdfcrop tmp.dynkin.pdf tmp.dynkin.pdf
    (success) pdfcrop tmp.dynkin.pdf tmp.dynkin.pdf
    Executing command: pdf2svg tmp.dynkin.pdf dynkin.svg
    (success) pdf2svg tmp.dynkin.pdf dynkin.svg
    (success) rm tmp.dynkin* (temp files removed)

The resulting picture is shown below. (O.K., so I just realized that my website isn’t currently set up to handle svg files. I’ll change that soon. For now, here’s the png.)

A couple of people have asked me recently how to properly format integrals in $\LaTeX$, be it controlling where the limits are displayed or adding half spaces before the differentials (a formatting standard that too many don’t even know about). This isn’t the first time those questions have been raised, so I’m going to address some of them here.

This is also my first post using MathJax. I may write a short article soon about MathJax, as I’ve found that integrating it into this site is ridiculously easy. MathJax allows for vector-based and $\LaTeX$/MathML-based math rendering on webpages, instead of the raster graphics you find on many sites having math content. Basically, I’m a big fan.

Proper Spacing inside Integrals

First, I’ll give an example of an improperly written improper integral. (That’s a pun.) The code in $\LaTeX$ that most would write looks like this:

$\lim_{t\to\infty}\int_a^tf(x)dx$

When rendered by $\LaTeX$ that code snippet produces the following:

$$\lim_{t\to\infty}\int_a^tf(x)dx$$

What’s wrong with this? The answer is that there should be a half space (a thin space, typed as \, in math mode) between the integrand and the $dx$. That is, we should have something more like

$\lim_{t\to\infty}\int_a^tf(x)\,dx$

which will appear as

$$\lim_{t\to\infty}\int_a^tf(x)\,dx.$$

This is a very tiny difference. (There is a half space added between the $f(x)$ and the $dx$.) But it is a matter of fastidious professionalism that many mathematicians observe. Look in any modern calculus text and you’ll notice that the half spaces are included. Look on a graduate TA’s calc II final, and the half spaces will probably not be included.
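If remembering the thin space is the hard part, one option is to wrap it in a macro. The name \dd below is my own choice for illustration, not a standard command; if a package you load already defines \dd, pick another name.

```latex
% Hypothetical helper (not a standard command): thin space followed by d.
\newcommand{\dd}{\,d}

% Usage: this expands to the same input as the corrected integral above.
\[
  \lim_{t\to\infty}\int_a^t f(x)\dd x
\]
```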

Display Styles

Most of those writing $\LaTeX$ know the difference between inline and display modes. The same integral written inline will appear differently than when written in display mode. As an example, consider the inline integral $\int_0^2x^2\,dx$, which in display mode appears as

$$\int_0^2x^2\,dx.$$

Sometimes, for various reasons, we may want to force $\LaTeX$ to use display mode in inline text. We may force the display mode version $\displaystyle\int_0^2x^2\,dx$ in place of the inline version $\int_0^2x^2\,dx.$ We do this using the displaystyle command, as in:

We may force the display mode version $\displaystyle\int_0^2x^2\,dx$ in place...

If you tinker around with this a bit, you discover that $\LaTeX$ drops to the smaller inline (text) style inside nested constructs such as fractions, even when the surrounding formula is in display mode. For instance, if we write

$f(x) = \frac{\int_0^1g(x)\,dx}{\int_0^1h(x)\,dx}$

then we would get

$$f(x) = \frac{\int_0^1g(x)\,dx}{\int_0^1h(x)\,dx}.$$

I don’t particularly like the way this looks, so I’ll use

$f(x) = \frac{\displaystyle\int_0^1g(x)\,dx}{\displaystyle\int_0^1h(x)\,dx}$

instead, which gives me

$$f(x) = \frac{\displaystyle\int_0^1g(x)\,dx}{\displaystyle\int_0^1h(x)\,dx}.$$
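Typing \displaystyle in front of every nested integral gets tedious, so a shorthand can help. The macro name \dint below is an assumption made for this sketch and is not a standard command.

```latex
% Hypothetical shorthand: an integral sign that is always set in display style.
\newcommand{\dint}{\displaystyle\int}

\[
  f(x) = \frac{\dint_0^1 g(x)\,dx}{\dint_0^1 h(x)\,dx}
\]
```

As with the hand-written version, \displaystyle stays in effect for the rest of the numerator or denominator it appears in, not only for the integral sign.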

Moving Limits

One thing you’ll notice is that limits are placed differently between inline and display modes for operators such as $\lim$ and $\sum$. Integrals are the exception: by default, $\LaTeX$ places the limits of integration to the southeast and northeast of the integral sign, even in display mode. Sometimes we wish to place the limits directly under (or directly over) the integral instead. Notice how the limits command is used in the following:

$\int\limits_{x^2 + y^2 \leq R^2} f(x,y)\,dx\,dy = \int\limits_{\theta=0}^{2\pi}\ \int\limits_{r=0}^R f(r\cos\theta,r\sin\theta) r\,dr\,d\theta$

This produces the output:

$$\int\limits_{x^2 + y^2 \leq R^2} f(x,y)\,dx\,dy = \int\limits_{\theta=0}^{2\pi}\ \int\limits_{r=0}^R f(r\cos\theta,r\sin\theta) r\,dr\,d\theta.$$

OK. I cheated a bit by putting the extra space between the two integral signs, but any experienced $\LaTeX$ user knows that sometimes the typesetting has to be coerced a bit.
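For a full $\LaTeX$ document (as opposed to MathJax on a web page, which does not read \usepackage lines), the amsmath package offers a document-wide alternative to writing \limits on every integral. The following is a sketch of that approach, together with \nolimits as the switch in the other direction.

```latex
% In the preamble: with the intlimits option, amsmath places integral limits
% above and below the sign in displayed formulas by default.
\usepackage[intlimits]{amsmath}

% In the document body: no \limits needed on each integral.
\[
  \int_{0}^{2\pi} \int_{0}^{R} f(r\cos\theta, r\sin\theta)\, r\,dr\,d\theta
\]

% \nolimits moves the limits of a single integral back to the side.
\[
  \int\nolimits_{0}^{1} x^2\,dx
\]
```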