I recently gave a talk on the Sage Notebook server that we set up for members of our university community. In that talk, I gave a short introduction to Cython. I’m going to post some of the examples from that portion of the talk here, and explain a bit about what’s going on behind the scenes. Let me start by saying that I think Cython is amazing. The main idea is to compile Python-like code into C extension modules that are callable from Python code.

Why would you want to do such a thing? In as few words as possible: Sage’s Python is dynamically typed, which makes it a friendly environment to code in. The problem is that the type checks, conversions, and function calls made at run time can cause Sage’s Python (or Python in general) to slow down considerably. By compiling C code with statically typed variables, we push the important calculations much closer to efficient machine code.

I’m doing these specific examples in the Sage Notebook. Cython grew out of the Sage project, where it began as a fork of Pyrex. If you use your Google-fu, you can find info about the history of Cython and so on. I really just want to look at a few examples here.

Recursively Generated Fibonacci Numbers

Let’s say that you wanted to create a simple function in Python to calculate Fibonacci numbers. I wouldn’t recommend doing the following, because the recursion recomputes the same values over and over and is incredibly inefficient. However, sometimes when I need to peg the processing power of a machine and it has Python, the following function does get the job done.

def fibonacci(n):
    if n == 0: return 0
    elif n == 1: return 1
    else: return fibonacci(n-1) + fibonacci(n-2)

If I run this in Sage and ask for the runtime, I get the following.

time fibonacci(33)
 
3524578
Time: CPU 6.53 s, Wall: 6.53 s

So, we’ve found the 33rd Fibonacci number in roughly 6.53 seconds. (The CPU time in essence measures the actual computation time. The Wall time measures the overall time. It could happen that the machine becomes overloaded and your process sits in a queue before or while being run, in which case the wall time will increase while the CPU time shouldn’t change much.)
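The difference between CPU time and wall time is easy to see outside of Sage as well. The following is a minimal sketch in plain Python 3 (it is not how Sage’s “time” command is implemented). The helper name “measure” is made up for this illustration; “time.process_time” and “time.perf_counter” are the standard library clocks for CPU time and wall time. Sleeping uses almost no CPU, so the wall time grows while the CPU time barely moves.

```python
import time

def measure(func, *args):
    """Run func(*args) and return (result, cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()   # wall clock
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, cpu, wall

# Sleeping makes the process wait without doing any computation.
_, cpu, wall = measure(time.sleep, 0.2)
print("Time: CPU %.2f s, Wall: %.2f s" % (cpu, wall))
```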

How can we improve this using Cython? From the standpoint of using the Sage Notebook, the only thing I really have to do in this instance is to change two very small details. The function will now look like this.

%cython
def fibonacci(int n):
    if n == 0: return 0
    elif n == 1: return 1
    else: return fibonacci(n-1) + fibonacci(n-2)

The first thing I’ve changed is that I’ve added “%cython” to the top line of the cell in the notebook worksheet where my code is located. This simply tells Sage that I want to use Cython instead of the usual Sage Python. The second thing I have done is that I’ve declared the variable “n” as “int n”. I’ve told Sage that the specific C datatype that I want to use for “n” is an int. The complication here is that you need to know something about C datatypes. C is statically typed and, unlike in Python, being competent in C means knowing what the types are and knowing precisely how those types get stored and accessed in system memory. More on that, and why it matters, in a bit. Let’s see how our Cython code runs in the same calculation as before.

time fibonacci(33)
 
3524578
Time: CPU 1.60 s, Wall: 1.61 s

Great. We went from 6.53 seconds to 1.60 seconds of computation time. Given that we did very little work to modify our code, that seems like a good deal. Now, I’ve glossed over a bit of a detail here. When I execute the cell containing my Cython function in the Sage Notebook, what actually happens is as follows. Cython converts chunks of my code to C, compiles that C code, and makes it accessible to Sage’s Python. When this happens, a link to the C code becomes available. In this case, the C code can be found here exactly as it was produced by Cython:

__home_sageserver__sage_sage_notebook_sagenb_home_hilljb_37_code_sage5_spyx.c

If you read that, you’ll notice that it’s considerably different from what any reasonable human would write for this function in C. But, it compiled and obviously sped up our calculation.
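For comparison, here is a minimal sketch of the non-recursive alternative hinted at earlier, written in plain Python (the function name “fibonacci_iterative” is just for illustration and is not from the talk). It takes a linear number of steps, so no compilation is needed to make it fast. It does defeat the purpose of pegging the CPU, though.

```python
def fibonacci_iterative(n):
    """Return the n-th Fibonacci number using a simple loop."""
    a, b = 0, 1            # a = F(0), b = F(1)
    for _ in range(n):
        a, b = b, a + b    # slide the pair forward one step
    return a

print(fibonacci_iterative(33))  # 3524578
```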

The Sum of Integers From 0 to n

Here’s an example of where Cython can give much greater speedup, but where one also must be very careful. Let’s consider a basic Python function to sum all the integers from 0 to some positive integer n. We have something like the following.

def sum_all(n):
    sum = 0
    for i in range(n+1):
        sum = sum + i
    return sum

Running this function on a suitably large integer goes something like this.

time sum_all(10^8)
 
5000000050000000
Time: CPU 19.11 s, Wall: 19.12 s

You’d be tempted to write the same function in Cython like this:

%cython
def sum_all(int n):
    cdef int sum = 0
    cdef int i = 0
    while i <= n:
        sum = sum + i
        i = i + 1
    return sum

The input value for the function is again a C int. Inside the function, we define the variable “sum” as a C int set initially to zero. We also define a C int variable named “i”, which serves as our loop counter. When we execute this code, it compiles fine. When we run it, we get the following.

time sum_all(10^8)
 
987459712
Time: CPU 0.16 s, Wall: 0.16 s

There are two things of note here: (1) It ran MUCH faster than the Python version. (2) The answer returned is wrong. The answer is wrong because I haven’t used the correct C datatypes. Any C programmer would know that a C int (typically 32 bits, depending on the system architecture and compiler) isn’t capable of holding the large integers in question. The running total simply overflows the bits available to it and the program won’t complain at all. We need to make sure that the C datatypes we use are actually capable of holding the data we’re going to throw at them. In this situation, we need an integer type that will hold a larger integer. So, I’ll go all out and use unsigned long long ints instead. That looks like this:

%cython
def sum_all(long long unsigned int n):
    cdef long long unsigned int sum = 0
    cdef long long unsigned int i = 0
    while i <= n:
        sum = sum + i
        i = i + 1
    return sum

And, when we run the calculation, we get the speed-up that we want (roughly 174 times over the original Python version) and the correct answer, in this case returned as a Python long:

time sum_all(10^8)
 
5000000050000000L
Time: CPU 0.11 s, Wall: 0.11 s
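The wrong answer from the int version isn’t random. Here is a small sketch in plain Python that reproduces it, under the assumption that the C int is 32 bits wide and that overflow wraps around in two’s complement (which is what happened here, although the C standard doesn’t promise it for signed types). The function names are made up for this illustration.

```python
def sum_all_exact(n):
    """Sum of the integers 0..n using Python's arbitrary-precision integers."""
    return n * (n + 1) // 2

def wrap_to_int32(value):
    """Reduce value to what a wrapping signed 32-bit integer would hold."""
    return (value + 2**31) % 2**32 - 2**31

exact = sum_all_exact(10**8)
print(exact)                  # 5000000050000000
print(wrap_to_int32(exact))   # 987459712
print(exact < 2**64)          # True: the answer fits in 64 unsigned bits
```

The second number matches the incorrect output above, and the last line shows why a 64-bit unsigned type is big enough for this calculation.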

A while ago, a friend asked me the following question: “Is it possible for the horizontal component of the baseball’s velocity to increase in magnitude and not be negated when the batter hits the ball?” That is, can the batter hit the ball in such a way as to make the ball continue traveling toward the catcher, yet increase the horizontal velocity of the ball? He posted this question on some forums and received some varied responses. Half the respondents said no and half said yes. He asked me, so I figured I’d test it out in Sage.

The first thing to point out is that this is still a very unrealistic calculation. I did my best to quickly account for properties the ball may have (size, weight, elasticity, etc.) and made a quick graphical representation of the ball and bat, with vectors representing their velocities. Here’s the code, and then I’ll discuss some of the findings.

def mph_to_mps(num):
    return 0.44704*num
 
def mps_to_mph(num):
    return 2.23693*num
 
def deg_to_rad(ang):
    return ang*pi/180
 
def rad_to_deg(ang):
    return ang*180/pi
 
def oz_to_g(mass):
    return 28.34952*mass
 
def get_velocity(speed,angle):
    # speed must be in meters per second
    # angle is measured from horizon in radians
    return vector([speed*cos(angle),speed*sin(angle)])
 
def get_momentum(mass,velocity):
    # mass must be in grams
    # velocity must be a vector
    return mass*velocity
 
@interact
def _(ball=['Baseball', '11Softball', '12Softball'],
      ball_angle=slider(0,90,0.05,default=20,label='Ball Angle (degrees from horizontal)'),\
      ball_speed=slider(10,90,0.1,default=70,label='Ball Speed (mph)'),\
      swing_angle=slider(-45,45,0.1,default=10,label='Swing Angle (degrees from horizontal)'),\
      bat_speed=slider(10,55,0.1,default=30,label='Bat Speed (mph)'),\
      bat_mass=slider(17,42,1,default=30,label='Bat Weight (in ounces)'),\
      impact_angle=slider(-90,90,0.05,default=10,label='Impact Angle (relative to swing)'),\
      efficiency=slider(80,99.5,0.5,default=96,label='efficiency factor (%)'),\
      elasticity=slider(0,1,0.05,default=0.15,label='elasticity factor of ball')):
          # compare strings with ==; "is" tests object identity, not equality
          if ball == 'Baseball':
              ball_rad = 0.03692394  # meters
              ball_mass = 145  # grams
          elif ball == '11Softball':
              ball_rad = 0.04446789  # meters
              ball_mass = 174  # grams
          elif ball == '12Softball':
              ball_rad = 0.04851019  # meters
              ball_mass = 206  # grams
          ball_angle = deg_to_rad(-ball_angle)
          print "Pitched Ball Speed (x-coord): %s mph" % n(cos(ball_angle)*ball_speed,digits=5)
          print "Pitched Ball Speed (y-coord): %s mph" % n(sin(ball_angle)*ball_speed,digits=5)
          ball_speed = mph_to_mps(ball_speed)
          ball_velocity = get_velocity(ball_speed,ball_angle)
          ball_momentum = get_momentum(ball_mass,ball_velocity)
          DBallMV = (1/ball_momentum.norm())*5*ball_momentum
          swing_angle = pi-deg_to_rad(swing_angle)
          bat_speed = mph_to_mps(bat_speed)
          bat_velocity = get_velocity(bat_speed,swing_angle)
          bat_mass = oz_to_g(bat_mass)
          bat_momentum = get_momentum(bat_mass*efficiency/100,bat_velocity)
          DBatMV = (1/bat_momentum.norm())*5*bat_momentum
          impact_angle=swing_angle-deg_to_rad(impact_angle)
          # find bat_momentum projection vector parallel to impact_angle
          moment_hit = vector([cos(impact_angle),sin(impact_angle)])
          moment_hit = moment_hit.dot_product(bat_momentum)
          moment_hit = moment_hit*vector([cos(impact_angle),sin(impact_angle)])
          # find approximate force of ball acceleration moment
          moment_ball = vector([cos(impact_angle),sin(impact_angle)])
          moment_ball = moment_ball.dot_product(ball_momentum)*elasticity
          moment_ball = -moment_ball*vector([cos(impact_angle),sin(impact_angle)])
          # add momentum components
          new_moment = moment_hit + ball_momentum + moment_ball
          DBallMV2 = (1/new_moment.norm())*5*new_moment
          ball_velocity = (1/ball_mass)*new_moment
          print "Hit Ball Speed (x-coord): %s mph" % n(mps_to_mph(ball_velocity[0]),digits=5)
          print "Hit Ball Speed (y-coord): %s mph" % n(mps_to_mph(ball_velocity[1]),digits=5)
          ball_speed = ball_velocity.norm()
          print "Hit Ball Speed (overall): %s mph" % n(mps_to_mph(ball_speed),digits=5)
          # Draw
          DBall = circle((0,0),radius=1,rgbcolor=(0,0,0))
          BatCoord1 = (-2*cos(impact_angle),2*sin(-impact_angle))
          BatCoord2 = (-2*cos(impact_angle)+DBatMV[0],2*sin(-impact_angle)+DBatMV[1])
          DBat = circle(BatCoord1,radius=1,rgbcolor='brown',fill=True)
          DBallM = arrow((0,0),(DBallMV[0],DBallMV[1]),color='red')
          DBatM = arrow(BatCoord1,BatCoord2,color='blue')
          DBallM2 = arrow((0,0),(DBallMV2[0],DBallMV2[1]),color='green')
          show(DBall+DBat+DBallM+DBatM+DBallM2,aspect_ratio=1,axes=False)
          print "Direction Legend: Red (pitched ball), Blue (bat), Green (hit ball)"

When executed in the Sage Notebook, this creates an interactive slider-based tool that allows me to change all the relevant variables (at least the ones that I’ve accounted for). By default, certain reasonable values are used to construct the situation. Here’s what we get.

I’m considering this in miles per hour because that’s what all the statistics in American baseball are given in. The white circle is the ball, the red circle is the bat. The arrows represent vectors: red is the velocity vector of the pitched baseball at the instant just before striking the bat, blue is the velocity vector of the bat at the instant just before striking the ball, and green is the velocity of the hit baseball.

There are many factors to consider here. The ball and bat are traveling at different angles relative to the ground. The ball may not hit the bat squarely in the center (the hit ball may be tipped back). We can simulate that situation by setting the impact angle higher on the bat. The impact angle in this bit of code is the angle where the ball hits the bat relative to the swinging velocity of the bat.
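To make the role of the impact angle concrete, here is a stripped-down sketch of the momentum step from the code above, written in plain Python 3 without Sage’s vector type. The function names “project” and “hit_momentum” and the numbers in the example are made up for illustration. The three steps mirror the original: project the bat’s momentum onto the impact direction, scale the ball’s own projection by the elasticity factor and reverse it, then add everything to the ball’s incoming momentum.

```python
import math

def project(vec, angle):
    """Project the 2D vector vec onto the unit vector pointing along angle."""
    ux, uy = math.cos(angle), math.sin(angle)
    dot = vec[0] * ux + vec[1] * uy
    return (dot * ux, dot * uy)

def hit_momentum(ball_momentum, bat_momentum, impact_angle, elasticity):
    """Combine ball and bat momentum the same way the interact code does."""
    hit = project(bat_momentum, impact_angle)
    rebound = project(ball_momentum, impact_angle)
    rebound = (-elasticity * rebound[0], -elasticity * rebound[1])
    return (hit[0] + ball_momentum[0] + rebound[0],
            hit[1] + ball_momentum[1] + rebound[1])

# Made-up numbers: ball moving in +x, bat moving in -x, square impact.
result = hit_momentum((10.0, 0.0), (-30.0, 0.0), 0.0, 0.5)
print(result)
```

With a square impact the bat’s full momentum is transferred. As the impact angle moves away from the swing direction, less of the bat’s momentum lands on the ball, which is what allows the ball to keep moving toward the catcher.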

Notice that in this example, the x-coordinate velocity of the ball is lower after hitting the bat. That is, the ball has slowed down its horizontal speed. I basically used the sliders to find a combination of variable values that would permit precisely the situation that my friend was looking for, making the x-coordinate velocity of the ball increase after striking the bat. Here’s an example.

Notice that the ball increases in its horizontal speed from 65.685 to 67.331 miles per hour. If the swing angle and impact angle are great enough, this situation occurs much more frequently than one might expect.

Hit Fastballs Don’t Fly Farther

This is something that is accepted as common knowledge in baseball and has never sat well with me. What most any baseball fan will tell you is that a fastball, if hit, will travel farther than a ball thrown at a slower speed and hit by the same batter. Is this true?

The justification for such an idea can be seen in the following experiment. Imagine a pitcher throws a ball at a brick wall. The harder the pitcher throws, the farther the ball will bounce back due to the ball’s elasticity. It stands to reason that the same effect occurs when a batter hits a fastball.

The only problem with this reasoning is that it doesn’t account for the momentum or movement of the wall or bat. The wall doesn’t budge when hit by the baseball, and the baseball’s energy gets transferred through elasticity and pushes the ball backwards. On the other hand, the bat is much, much lighter and isn’t in a fixed location. The momentum of a fastball will exert more force on a bat than that of a slowly thrown ball, slowing the bat and making it have less of an impact on the distance the ball carries.

Not convinced? Imagine what would happen if we were to take this to an extreme. Imagine the ball starts to travel at 1,000,000 miles per hour and, somehow, the batter swings and connects. At that point, even though the ball is very light, it has an incredible amount of momentum due to its velocity. Upon hitting the bat, the ball would demolish the bat and probably just keep going toward a catcher who is about to have a very bad day. Consider the following two situations.

Notice that the ball is pitched at 75 mph and the ball ends up traveling 102.39 mph after striking the bat. Now, we’ll change the pitched velocity to 85.79 mph and watch what happens to the final speed of the ball. It goes down to 92.93 mph.


On my laptop, I run Crunchbang Linux, because the machine has a single-core Intel CULV processor (read: full size laptop, lightweight, extremely low power consumption with a max TDP of 5.5W, connect to a server to do anything serious). I run Conky as a system monitor. This basic post explains how you can write Python scripts and have them executed by Conky.

For instance, I like to be able to have my CPU temp displayed by Conky. To do so, I use the following incredibly basic Python script.

#! /usr/bin/env python
 
import subprocess
 
cmd = 'grep temp /proc/acpi/thermal_zone/TZ00/temperature'
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
 
output = p.stdout.read()
output = output.split(': ')[1].split('\n')[0]
 
print "%s" % output

I should point out that, based on your motherboard sensors and specific OS, your temperature information may be harder to come by.
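If you’d rather not spawn grep through a shell, the same value can be read directly from the file. This is only a sketch of that alternative, written for Python 3. The function names are made up, and the sample line is my assumption about what the file contains on a machine like mine (yours may differ, or the file may not exist at all).

```python
def parse_temperature(line):
    """Return the text after the first colon, with surrounding whitespace removed."""
    return line.split(":", 1)[1].strip()

def read_temperature(path="/proc/acpi/thermal_zone/TZ00/temperature"):
    """Return the temperature text from the ACPI file, or 'n/a' if not found."""
    with open(path) as handle:
        for line in handle:
            if "temp" in line:
                return parse_temperature(line)
    return "n/a"

# Assumed sample of the file's contents, used here instead of the real file.
sample = "temperature:             45 C\n"
print(parse_temperature(sample))
```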

I saved that Python script as an executable named “tempwatch” in ~/bin (which is in my PATH). Once that is set up, you can edit your ~/.conkyrc file to include the following line:

Temperature:$alignr${execi 10 tempwatch}

This tells Conky to reload the temperature information every 10 seconds and display it.

Pretty simple.


A hard drive is going bad and I need to back it up (or, back up as much as possible) before it totally dies. I’ll clone the drive to a new, working hard drive. This is also a handy way of completely duplicating a drive for backup or other purposes.

I’m going to use the command dd. This is an old UNIX command to convert and copy files. It will copy one drive onto the other for me. Of course, I can’t be logged in to the OS on the drive that I want to copy, so I’m going to use a live CD. I’m using an Ubuntu disc, because it’s what I have lying around.

  1. Boot into the live CD.
  2. Open a terminal, or use Ctrl+Alt+F1 to change virtual screens (Ctrl+Alt+F7 gets you back).
  3. Find where your drives are located. You can do this by typing “fdisk -l” or looking around in /dev. The drive I want to copy is /dev/sda and the destination drive is /dev/sdb
  4. Use the command “dd if=/dev/sda of=/dev/sdb conv=noerror,sync bs=1024” where “if” stands for “input file”, “of” is “output file”, “conv” gives a list of conversion options (in this case “noerror” tells dd to keep going after read errors and “sync” tells dd to pad every short or unreadable input block with zeros, so the data stays aligned on the new drive), and “bs” gives the block size, in bytes, used for each read and write.
Now, sit back and wait a while.
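If the padding done by “sync” seems mysterious, here is a toy illustration in Python 3. It is not a replacement for dd and it doesn’t model “noerror” at all. The function name “copy_with_sync” is made up, and in-memory buffers stand in for the two drives. It copies in fixed-size blocks and pads a short block with zero bytes up to the block size.

```python
import io

def copy_with_sync(src, dst, block_size=1024):
    """Copy src to dst in fixed-size blocks, zero-padding any short block."""
    while True:
        block = src.read(block_size)
        if not block:
            break
        if len(block) < block_size:
            # This is the padding that conv=sync performs on a short read.
            block += b"\x00" * (block_size - len(block))
        dst.write(block)

source = io.BytesIO(b"0123456789")   # 10 bytes of "drive" data
target = io.BytesIO()
copy_with_sync(source, target, block_size=4)
print(target.getvalue())             # b'0123456789\x00\x00'
```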

I usually $\LaTeX$ lecture notes in my classes, both those I take and those I teach. (If you’re a coder and you write $\LaTeX$ for absolutely everything you do in class for five years or so, you get really fast at it.) I’ve yet to find a good $\LaTeX\rightarrow\text{HTML}$ converter, and I’ve tried several. There are good reasons for this, not the least of which is that the markups of these languages are different enough to cause a lot of difficulty in the translation.

With MathJax (of which I am a huge fan) allowing for specific bits of $\LaTeX$ to be entered directly into web sources via scripts, a lot of the problem is solved. But, I still want to be able to include parts and packages of $\LaTeX$ not supported by MathJax, such as PGF and TikZ.

I’m going to approach this from a modular perspective, bit by bit over time. The first thing that needs to be done is I need to write a converter that will form *.svg files from graphics created via pdflatex. I’m going to write this in Python, partially because it is well-suited to the task and partially because it will be fast to develop. That’s what this post will develop. After that is done, I’ll need to parse $\LaTeX$ code into three groups: (1) Text that can be formatted in HTML, (2) $\LaTeX$ commands that can transfer directly to MathJax, and (3) $\LaTeX$ commands that will need to be converted to *.svg files.

Dependencies

I’m writing this a bit selfishly, for my own setup. I use Debian-based Linux distributions and I’ll need the following to be callable in my path: pdflatex, pdfcrop, pdf2svg, and standard BASH commands such as ‘rm’.

Main Idea

The idea is that each bit of $\LaTeX$ in my finished document that is not text and cannot be given to MathJax should be stripped from my document and placed on a separate page. Each page is then written as a pdf, then cropped (the page style is empty, so there are no page numbers and so on) to the bounding box of the generated image, then saved as a vector format svg file. Once complete, the images are hooked back into the HTML document among the MathJax code.

The Code

The Python code for the program latex2img is as follows.

#!/usr/bin/env python
 
###############################################################################
# Jason B. Hill (Jason.B.Hill@Colorado.edu)
###############################################################################
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program.  If not, see <http://www.gnu.org/licenses/>
###############################################################################
 
import sys              # used to parse command line options
import subprocess       # used to execute shell commands / obtain exit status
import re               # used to format filenames
 
# version numbers are printed in tenths
latex2img_version = 0.1
# 9-8-11 Initial coding
 
###############################################################################
# Define functions
###############################################################################
 
def print_intro(version):
    print "################################################"
    print "# latex2img version %.1f" % version
    print "# Jason B. Hill (www.jasonbhill.com)"
    print "################################################"
 
def read_latex(file):
    lines = []
    try:
        FILE = open(file, "r")
    except:
        print "Error: file read error"
        sys.exit(0)
    for blob in FILE: lines.append(blob)
    return lines
 
def write_latex(file,latex_data):
    try:
        FILE = open(file, "w")
    except:
        print "Error: file write error"
        sys.exit(0)
    FILE.write("\\documentclass[letterpaper,11pt]{article}\n")
    FILE.write("\\usepackage{fullpage,amsfonts,amsmath,amssymb,tikz}\n\n")
    FILE.write("\\pagestyle{empty}\n\n")
    FILE.write("\\begin{document}\n\n")
    #FILE.write("\\[\n")
    for blob in latex_data: FILE.write(blob)
    #FILE.write("\\]\n\n")
    FILE.write("\\end{document}\n")
    FILE.close()
 
###############################################################################
# Parse options and raise errors if needed
###############################################################################
 
if len(sys.argv) < 2:
    print "Error: Too few options passed."
    print "Usage: latex2img file[.tex]"
    sys.exit(0)
elif len(sys.argv) > 2:
    print "Error: Too many options passed."
    print "Usage: latex2img file[.tex]"
    sys.exit(0)
try:
    file_path = sys.argv[1]
    FILE = open(file_path,"r")
except:
    print "Error: File %s does not exist or is not readable." % file_path
    sys.exit(0)
 
###############################################################################
# Execute
###############################################################################
 
### Set filenames
p = re.compile("^(.*)\.(tex|tikz)$", re.IGNORECASE)
m = p.match(file_path)
if m:
    tex_filename = "tmp.%s.tex" % m.group(1)
    pdf_filename = "tmp.%s.pdf" % m.group(1)
    svg_filename = "%s.svg" % m.group(1)
    temp_path = "tmp.%s*" % m.group(1)
else:
    tex_filename = "tmp.%s.tex" % file_path
    pdf_filename = "tmp.%s.pdf" % file_path
    svg_filename = "%s.svg" % file_path
    temp_path = "tmp.%s*" % file_path
 
### Print information at shell
print_intro(latex2img_version)
 
### Read tikz info and place in a LaTeX file
latex_data = read_latex(file_path)
write_latex(tex_filename,latex_data)
 
### Execute pdflatex
command = "pdflatex -shell-escape %s" % tex_filename
print "Executing command: " + command
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
process.communicate()   # wait for the command and drain its output pipe
if process.returncode == 0:
    print "(success) " + command
 
### Execute pdfcrop
command = "pdfcrop %s %s" % (pdf_filename,pdf_filename)
print "Executing command: " + command
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
process.communicate()   # wait for the command and drain its output pipe
if process.returncode == 0:
    print "(success) " + command
 
### Execute pdf2svg
command = "pdf2svg %s %s" % (pdf_filename,svg_filename)
print "Executing command: " + command
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
process.communicate()   # wait for the command and drain its output pipe
if process.returncode == 0:
    print "(success) " + command
 
### Remove temporary files
command = "rm %s" % temp_path
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
process.communicate()   # wait for the command and drain its output pipe
if process.returncode == 0:
    print "(success) " + command + " (temp files removed)"
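As an aside, the filename handling above could also be done without a regular expression. Here is a rough sketch using “os.path.splitext”, written for Python 3. The function name “derive_names” is made up, and I’d expect it to agree with the regular expression for ordinary filenames, though odd cases (such as a file named just “.tex”) may come out differently.

```python
import os

def derive_names(file_path):
    """Return (tex_filename, pdf_filename, svg_filename) for an input file."""
    stem, ext = os.path.splitext(file_path)
    if ext.lower() not in (".tex", ".tikz"):
        stem = file_path   # unknown extension: keep the whole name
    return ("tmp.%s.tex" % stem, "tmp.%s.pdf" % stem, "%s.svg" % stem)

print(derive_names("dynkin.tex"))
# ('tmp.dynkin.tex', 'tmp.dynkin.pdf', 'dynkin.svg')
```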

Example

I’ll create a file named dynkin.tex somewhere in a directory with appropriate permissions. The file looks like this. (Notice that there are no $\LaTeX$ headers/footers or other definitions in the file.)

\[
\begin{array}{|c|c|c|}
\hline
\textbf{Type} & \textbf{Cartan Matrix} & \textbf{Dynkin Diagram}\\
\hline
\hline
 & & \\
E_6 & \left(
\begin{array}{rrrrrr}
2 & 0 & -1 & 0 & 0 & 0\\
0 & 2 & 0 & -1 & 0 & 0\\
-1 & 0 & 2 & -1 & 0 & 0\\
0 & -1 & -1 & 2 & -1 & 0\\
0 & 0 & 0 & -1 & 2 & -1\\
0 & 0 & 0 & 0 & -1 & 2
\end{array}
\right) &
\begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (5,0);
    \draw (3,0) -- (3,1);
 
    \filldraw[color=black] (1,0) circle (5pt);
    \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt);
    \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt);
    \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt);
    \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt);
    \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt);
    \filldraw[color=white] (3,1) circle (4pt);
\end{tikzpicture}\\
 
& & \\
\hline
& & \\
 
E_7 & \left(
\begin{array}{rrrrrrr}
2 & 0 & -1 & 0 & 0 & 0 & 0\\
0 & 2 & 0 & -1 & 0 & 0 & 0\\
-1 & 0 & 2 & -1 & 0 & 0 & 0\\
0 & -1 & -1 & 2 & -1 & 0 & 0\\
0 & 0 & 0 & -1 & 2 & -1 & 0\\
0 & 0 & 0 & 0 & -1 & 2 & -1\\
0 & 0 & 0 & 0 & 0 & -1 & 2
\end{array}
\right) &
\begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (6,0);
    \draw (3,0) -- (3,1);
 
    \filldraw[color=black] (1,0) circle (5pt);
    \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt);
    \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt);
    \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt);
    \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt);
    \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (6,0) circle (5pt);
    \filldraw[color=white] (6,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt);
    \filldraw[color=white] (3,1) circle (4pt);
\end{tikzpicture}\\
 
& & \\
\hline
& & \\
 
E_8 & \left(
\begin{array}{rrrrrrrr}
2 & 0 & -1 & 0 & 0 & 0 & 0 & 0\\
0 & 2 & 0 & -1 & 0 & 0 & 0 & 0\\
-1 & 0 & 2 & -1 & 0 & 0 & 0 & 0\\
0 & -1 & -1 & 2 & -1 & 0 & 0 & 0\\
0 & 0 & 0 & -1 & 2 & -1 & 0 & 0\\
0 & 0 & 0 & 0 & -1 & 2 & -1 & 0\\
0 & 0 & 0 & 0 & 0 & -1 & 2 & -1\\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 2
\end{array}
\right) &
\begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (7,0);
    \draw (3,0) -- (3,1);
 
    \filldraw[color=black] (1,0) circle (5pt);
    \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt);
    \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt);
    \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt);
    \filldraw[color=white] (4,0) circle (4pt);
    \filldraw[color=black] (5,0) circle (5pt);
    \filldraw[color=white] (5,0) circle (4pt);
    \filldraw[color=black] (6,0) circle (5pt);
    \filldraw[color=white] (6,0) circle (4pt);
    \filldraw[color=black] (7,0) circle (5pt);
    \filldraw[color=white] (7,0) circle (4pt);
    \filldraw[color=black] (3,1) circle (5pt);
    \filldraw[color=white] (3,1) circle (4pt);
\end{tikzpicture}\\
 
& & \\
\hline
& & \\
 
F_4 & \left(
\begin{array}{rrrr}
2 & -1 & 0 & 0\\
-1 & 2 & -2 & 0\\
0 & -1 & 2 & -1\\
0 & 0 & -1 & 2
\end{array}
\right) &
\begin{tikzpicture}[scale=0.7]
    \draw (1,0) -- (2,0);
    \draw (3,0) -- (4,0);
    \draw (2,0.1) -- (3,0.1);
    \draw (2,-0.1) -- (3,-0.1);
    \draw (2.4,0.2) -- (2.6,0) -- (2.4,-0.2);
 
    \filldraw[color=black] (1,0) circle (5pt);
    \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt);
    \filldraw[color=white] (2,0) circle (4pt);
    \filldraw[color=black] (3,0) circle (5pt);
    \filldraw[color=white] (3,0) circle (4pt);
    \filldraw[color=black] (4,0) circle (5pt);
    \filldraw[color=white] (4,0) circle (4pt);
\end{tikzpicture}\\
 
& & \\
\hline
& & \\
 
G_2 & \left(
\begin{array}{rr}
2 & -1\\
-3 & 2
\end{array}
\right) &
\begin{tikzpicture}[scale=0.7]
    \draw (1,0.1) -- (2,0.1);
    \draw (1,-0.1) -- (2,-0.1);
    \draw (1,0) -- (2,0);
    \draw (1.6,0.2) -- (1.4,0) -- (1.6,-0.2);
 
    \filldraw[color=black] (1,0) circle (5pt);
    \filldraw[color=white] (1,0) circle (4pt);
    \filldraw[color=black] (2,0) circle (5pt);
    \filldraw[color=white] (2,0) circle (4pt);
\end{tikzpicture}\\
 
& & \\
\hline
\end{array} 
\]

Next, I place the latex2img program somewhere accessible by my path and issue the following command.

jason@descartes:~$ latex2img dynkin.tex
################################################
# latex2img version 0.1
# Jason B. Hill (www.jasonbhill.com)
################################################
Executing command: pdflatex -shell-escape tmp.dynkin.tex
(success) pdflatex -shell-escape tmp.dynkin.tex
Executing command: pdfcrop tmp.dynkin.pdf tmp.dynkin.pdf
(success) pdfcrop tmp.dynkin.pdf tmp.dynkin.pdf
Executing command: pdf2svg tmp.dynkin.pdf dynkin.svg
(success) pdf2svg tmp.dynkin.pdf dynkin.svg
(success) rm tmp.dynkin* (temp files removed)

The resulting picture is shown below. (O.K., so I just realized that my website isn’t currently set up to handle svg files. I’ll change that soon. For now, here’s the png.)


Here’s a problem you can encounter if you’re working with memory-intensive calculations using integer types in C and you want your code to be portable (i.e., you want it to function the same way on multiple platforms). The ISO/IEC 9899:1990 standard specified that C should have four signed and unsigned integer types: char (yes, char is an integer type), short, int, and long. In 1999, _Bool was added as an integer type that holds only the values 0 and 1, and long long was added. The standards don’t fix the size of these integer types (how many bytes they should use), other than requiring that short and int be at least 16 bits, that long be at least as large as int and at least 32 bits, and that long long be at least as large as long and at least 64 bits.

The problem arising from this vagueness can be demonstrated in the following little C program:

/* int-type-memsize.c                                                        */
/* Jason B. Hill (jason@jasonbhill.com)                                      */
/*                                                                           */
/* Returns the memory size of integer data types in C                        */
 
#include <stdio.h>
 
int main(void) {
    printf("--------------------------------------------------------------\n");
    printf("C Integer data types on this %d-bit machine\n", __WORDSIZE);
    printf("--------------------------------------------------------------\n");
 
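    /* sizeof yields a size_t, so cast it to match the %lu conversion */
    printf("short                    %lu bytes\n", (unsigned long) sizeof(short));
    printf("int                      %lu bytes\n", (unsigned long) sizeof(int));
    printf("long                     %lu bytes\n", (unsigned long) sizeof(long));
    printf("long long                %lu bytes\n", (unsigned long) sizeof(long long));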
 
    return 0;
} /* main */

Compiling this with GCC on my 64-bit Xubuntu desktop gives the following result:

--------------------------------------------------------------
C Integer data types on this 64-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     8 bytes
long long                8 bytes

The same code compiled with Visual C++ and executed on a 64-bit Windows machine gives this result:

--------------------------------------------------------------
C Integer data types on this 64-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     4 bytes
long long                8 bytes

And here’s what happens with GCC on a 32-bit Red Hat machine:

--------------------------------------------------------------
C Integer data types on this 32-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     4 bytes
long long                8 bytes

Obviously, there is something more complicated going on here than the difference between 32-bit and 64-bit architectures. It gets even more complicated on older 64-bit UNICOS-based systems like the CRAY T3E, where short, int, long, and long long were all 8 bytes. The differences come from the data model each platform adopts: Microsoft’s Visual C++ compiler uses the LLP64 model on 64-bit Windows, most 64-bit UNIX systems use LP64, 32-bit systems such as the Red Hat machine above use ILP32, and UNICOS was SILP64. You can find out more about what this means at The Open Group’s page and Wikipedia’s entry on 64-bit architectures.
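To make the data model idea concrete, here is a rough sketch that guesses the model from the sizes of int, long, and a pointer. The type and function names (data_model, classify_model, this_platform_model) are made up for this illustration, ILP32 is the usual name for the common 32-bit model, and the whole thing assumes 8-bit bytes. It can't tell ILP64 from SILP64, since that depends on the size of short.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names for this sketch; they are not part of any library. */
typedef enum {
    MODEL_ILP32,    /* int, long, pointer: 4 bytes each          */
    MODEL_LLP64,    /* int, long: 4 bytes; pointer: 8 bytes      */
    MODEL_LP64,     /* int: 4 bytes; long, pointer: 8 bytes      */
    MODEL_ILP64,    /* int, long, pointer: 8 bytes (or SILP64)   */
    MODEL_UNKNOWN   /* anything this sketch does not recognize   */
} data_model;

/* Classify a data model from the byte sizes of int, long, and a pointer. */
data_model classify_model(size_t int_size, size_t long_size, size_t ptr_size) {
    if (int_size == 4 && long_size == 4 && ptr_size == 4) return MODEL_ILP32;
    if (int_size == 4 && long_size == 4 && ptr_size == 8) return MODEL_LLP64;
    if (int_size == 4 && long_size == 8 && ptr_size == 8) return MODEL_LP64;
    if (int_size == 8 && long_size == 8 && ptr_size == 8) return MODEL_ILP64;
    return MODEL_UNKNOWN;
}

/* Classify the platform this file is compiled for. */
data_model this_platform_model(void) {
    assert(CHAR_BIT == 8);  /* the byte counts above assume 8-bit bytes */
    return classify_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Feeding in the sizes printed above, the Xubuntu machine (4, 8, 8) comes out as LP64, the Windows machine (4, 4, 8) as LLP64, and the 32-bit Red Hat machine (4, 4, 4) as ILP32.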

This makes portable code with the standard C integer types a pain for two reasons. Firstly, the range of values capable of being stored in any signed or unsigned integer type varies by machine. Secondly, the memory requirements of data structures using various integer types also vary, sometimes drastically. If you want better control over integer data types in C, allowing you to avoid these issues, you need to use the inttypes.h standard header. (It was added in C99 and includes stdint.h, which is where the fixed-width types are actually defined.) We can rewrite our program above using inttypes.h as follows.

/* int-types-portable-memsize.c                                              */
/* Jason B. Hill (jason@jasonbhill.com)                                      */
/*                                                                           */
/* Returns the memory size of portable integer data types in C               */
 
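#include <inttypes.h>
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* __WORDSIZE is a glibc macro that other compilers may not define, so  */
    /* the pointer width in bits stands in for the machine's word size.     */
    printf("--------------------------------------------------------------\n");
    printf("C Integer data types on this %d-bit machine\n", (int) (CHAR_BIT * sizeof(void *)));
    printf("--------------------------------------------------------------\n");

    /* sizeof yields a size_t; cast it so the argument really matches %lu.  */
    printf("int8_t                    %lu bytes\n", (unsigned long) sizeof(int8_t));
    printf("int16_t                   %lu bytes\n", (unsigned long) sizeof(int16_t));
    printf("int32_t                   %lu bytes\n", (unsigned long) sizeof(int32_t));
    printf("int64_t                   %lu bytes\n", (unsigned long) sizeof(int64_t));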
 
    return 0;
} /* main */

On any machine with a competent C99 compiler, the sizes should now be consistent (only the word size in the banner will vary), like this:

--------------------------------------------------------------
C Integer data types on this 32-bit machine
--------------------------------------------------------------
int8_t                    1 bytes
int16_t                   2 bytes
int32_t                   4 bytes
int64_t                   8 bytes

Of course, we’ve masked something a bit here. The above code passes sizeof(int32_t) to printf, and sizeof yields a value of type size_t, which we print as an unsigned long by asking for “%lu”. (The conversion that matches size_t directly is “%zu”, added in C99.) What if we simply want to print a variable saved as type int32_t or int64_t? The underlying type differs by platform: an int64_t, for example, may be a long on some systems and a long long on others, so no single conversion such as “%ld” is right everywhere. Thus, inttypes.h provides the appropriate printf and scanf macros for dealing with each of the data types it defines. Each macro expands to a string literal holding the right conversion, without the leading percent sign. Here’s a table to summarize the type and macro definitions.

Type      Description              [min,max] value range              printf  scanf
int8_t    8-bit signed integer     \(\left[-2^7,2^7-1\right]\)        PRId8   SCNd8
uint8_t   8-bit unsigned integer   \(\left[0,2^8-1\right]\)           PRIu8   SCNu8
int16_t   16-bit signed integer    \(\left[-2^{15},2^{15}-1\right]\)  PRId16  SCNd16
uint16_t  16-bit unsigned integer  \(\left[0,2^{16}-1\right]\)        PRIu16  SCNu16
int32_t   32-bit signed integer    \(\left[-2^{31},2^{31}-1\right]\)  PRId32  SCNd32
uint32_t  32-bit unsigned integer  \(\left[0,2^{32}-1\right]\)        PRIu32  SCNu32
int64_t   64-bit signed integer    \(\left[-2^{63},2^{63}-1\right]\)  PRId64  SCNd64
uint64_t  64-bit unsigned integer  \(\left[0,2^{64}-1\right]\)        PRIu64  SCNu64
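Here is a small sketch of how the macros in the table are meant to be used. The helper names format_i64 and parse_u32 are invented for this example and are not part of inttypes.h. The key detail is that each macro is a string literal without the percent sign, so you write the "%" yourself and let the compiler join the adjacent literals.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: write an int64_t into buf as decimal text.       */
/* Returns the number of characters written, excluding the terminator.   */
int format_i64(char *buf, size_t len, int64_t value) {
    /* PRId64 expands to a literal such as "ld" or "lld", depending on   */
    /* the platform; "%" PRId64 becomes the complete conversion.         */
    int n = snprintf(buf, len, "%" PRId64, value);
    assert(n >= 0 && (size_t) n < len);  /* the caller's buffer must fit it */
    return n;
}

/* Hypothetical helper: read a uint32_t from text.                       */
/* Returns 1 if a value was parsed and 0 otherwise.                      */
int parse_u32(const char *text, uint32_t *out) {
    return sscanf(text, "%" SCNu32, out) == 1;
}
```

The same pattern works for every row of the table: swap in the macro that matches the type of the variable, and the format string stays correct whether the underlying type is an int, a long, or a long long.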
Posted in C

A couple people have asked me recently how to properly format integrals in \(\LaTeX\), be it controlling where the limits are displayed or adding half spaces before the differentials (a formatting standard that too many don’t even know about). This isn’t the first time those questions have been raised, so I’m going to address some of them here.

This is also my first post using MathJax. I may write a short article soon about MathJax, as I’ve found that integrating it into this site is ridiculously easy. MathJax allows for vector-based and \(\LaTeX\)/MathML-based math rendering on webpages, instead of the raster graphics you find on many sites having math content. Basically, I’m a big fan.

Proper Spacing inside Integrals

First, I’ll give an example of an improperly written improper integral. (That’s a pun.) The code in \(\LaTeX\) that most would write looks like this:

\[
\lim_{t\to\infty}\int_a^tf(x)dx
\]

When rendered by \(\LaTeX\) that code snippet produces the following:

$$\lim_{t\to\infty}\int_a^tf(x)dx$$

What’s wrong with this? The answer is that there should be a half space (a thin space, which \(\LaTeX\) writes as \, in math mode) between the integrand and the \(dx\). That is, we should have something more like

\[
\lim_{t\to\infty}\int_a^tf(x)\,dx
\]

which will appear as

$$\lim_{t\to\infty}\int_a^tf(x)\,dx.$$

This is a very tiny difference. (There is a half space added between the \(f(x)\) and the \(dx\).) But, it is a matter of anal professionalism that many mathematicians follow. Look in any modern calculus text and you’ll notice that the half spaces are included. Look on a graduate TA’s calc II final, and the half spaces will probably not be included.
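If typing the half space by hand every time gets tedious, one option is to wrap it in a macro. This is only a sketch: the macro name \dd is an arbitrary choice made for this example (it is not a standard \(\LaTeX\) command), and it needs nothing beyond base \(\LaTeX\).

```latex
\documentclass{article}
% \dd is a hypothetical helper name: a thin space followed by the letter d.
\newcommand{\dd}{\,d}
\begin{document}
\[
\lim_{t\to\infty}\int_a^t f(x)\dd x
\]
\end{document}
```

The upside of a macro is that the spacing lives in one place, so you can change it later without touching every integral in the document.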

Display Styles

Most of those writing \(\LaTeX\) know the difference between inline and display modes. The same integral written inline will appear differently than when written in display mode. As an example, consider the inline integral \(\int_0^2x^2\,dx\), which in display mode appears as

$$\int_0^2x^2\,dx.$$

Sometimes, for various reasons, we may want to force \(\LaTeX\) to use display mode in inline text. We may force the display mode version \(\displaystyle\int_0^2x^2\,dx\) in place of the inline version \(\int_0^2x^2\,dx\). We do this using the \displaystyle command, as in:

We may force the display mode version $\displaystyle\int_0^2x^2\,dx$ in place...

If you tinker around with this a bit, you’ll discover that \(\LaTeX\) drops to the smaller inline (text) style inside nested constructs, such as the numerator and denominator of a fraction, even in display mode. For instance, if we write

\[
f(x) = \frac{\int_0^1g(x)\,dx}{\int_0^1h(x)\,dx}
\]

then we would get

$$f(x) = \frac{\int_0^1g(x)\,dx}{\int_0^1h(x)\,dx}.$$

I don’t particularly like the way this looks, so I’ll use

\[
f(x) = \frac{\displaystyle\int_0^1g(x)\,dx}{\displaystyle\int_0^1h(x)\,dx}
\]

instead, which gives me

$$f(x) = \frac{\displaystyle\int_0^1g(x)\,dx}{\displaystyle\int_0^1h(x)\,dx}.$$

Moving Limits

One thing you’ll notice is that operators such as \(\lim\) and \(\sum\) place their limits differently between inline and display modes in \(\LaTeX\). Integrals are the exception: even in display mode, the default is to place the limits of integration at a southeast and northeast location relative to the integral sign. Sometimes we wish to place the limits directly under (or directly over) the integral instead. Notice how the \limits command is used in the following:

\[
\int\limits_{x^2 + y^2 \leq R^2} f(x,y)\,dx\,dy
= \int\limits_{\theta=0}^{2\pi}\ \int\limits_{r=0}^R f(r\cos\theta,r\sin\theta) r\,dr\,d\theta
\]

This produces the output:

$$\int\limits_{x^2 + y^2 \leq R^2} f(x,y)\,dx\,dy
= \int\limits_{\theta=0}^{2\pi}\ \int\limits_{r=0}^R f(r\cos\theta,r\sin\theta) r\,dr\,d\theta.$$

OK. I cheated a bit by putting the extra space between the two integral signs, but any experienced \(\LaTeX\) user knows that sometimes the typesetting has to be coerced a bit. :-)
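If you want this placement for every integral, typing \limits each time gets old. One alternative, assuming you are able to load the amsmath package, is its intlimits option, which makes under/over limits the default for integrals in display mode. A minimal sketch:

```latex
\documentclass{article}
% The intlimits option places integral limits below and above the sign
% in displayed equations, as if \limits had been written each time.
\usepackage[intlimits]{amsmath}
\begin{document}
\[
\int_{\theta=0}^{2\pi}\ \int_{r=0}^{R} f(r\cos\theta,r\sin\theta)\,r\,dr\,d\theta
\]
\end{document}
```

With this option on, \nolimits after an integral sign restores the usual side placement for that one integral.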


The iso646.h header file in the C standard library isn’t found in many texts on C. It was added to the standard library in 1995 as an amendment to the C90 standard. It is one of the more basic library header files, serving two purposes: (1) it lets programmers using international non-QWERTY keyboards write C more easily, without resorting to digraphs and trigraphs, and (2) it gives the bitwise and logical operators more natural-language spellings. The only job of the header is to define macros (on the left in the table below) that expand to C tokens (on the right).

and     &&
and_eq  &=
bitand  &
bitor   |
compl   ~
not     !
not_eq  !=
or      ||
or_eq   |=
xor     ^
xor_eq  ^=
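To see the macros in action, here is a short sketch. The function names in_range_nonzero and update_flags are invented for this example. Each line using a macro behaves exactly like its symbolic form, shown in the comments.

```c
#include <assert.h>
#include <iso646.h>

/* Hypothetical example: 1 when x lies in [lo, hi] and is not zero, else 0. */
int in_range_nonzero(int x, int lo, int hi) {
    assert(lo <= hi);                               /* precondition on the range */
    return x >= lo and x <= hi and not (x == 0);    /* && ... && !(...)          */
}

/* Hypothetical example: clear the bits in mask, then set the bits in extra. */
unsigned update_flags(unsigned flags, unsigned mask, unsigned extra) {
    flags and_eq compl mask;    /* flags &= ~mask; */
    flags or_eq extra;          /* flags |= extra; */
    return flags;
}
```

Because these are ordinary macros, leaving out the #include turns a name like and into an undeclared identifier, and the compiler rejects the code.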

Interestingly, these are all operator keywords in C++. There is a ciso646 header for C++ which may be included for consistency, yet it has no effect because it is empty.

Posted in C