The arithmetic sequence, 1487, 4817, 8147, in which each of the terms increases by 3330, is unusual in two ways: (i) each of the three terms are prime, and, (ii) each of the 4-digit numbers are permutations of one another.

There are no arithmetic sequences made up of three 1-, 2-, or 3-digit primes, exhibiting this property, but there is one other 4-digit increasing sequence.

What 12-digit number do you form by concatenating the three terms in this sequence?

### Sage Solution

There are really only a handful of ways to go about this, and the timing isn’t going to be too substantial. So, I’m simply going to write an in-line Sage code block and execute it. The idea is to get a list of primes between 1,000 and 10,000, sort their digit decompositions, and then (after seeing which primes have the same digit decompositions), see which prime subsets have an arithmetic sequence underlying the distances between them. No too horribly interesting, but the code below does the job.

import time   start = time.time()   # make a list of primes P = prime_range(1000,10000)   # for each prime, make a list of digits, and sort that list L = [p.digits() for p in P] for i in range(len(L)): L[i].sort() L.sort()   # make a list of digits that appear for at least 3 primes M = [] for i in range(len(L)-3): if L[i]==L[i+1] and L[i+1]==L[i+2] and L[i+2]==L[i+3]: if L[i] not in M: M.append(L[i])   # N is a list of lists of at least 3 primes with the same 4 digits N = [] for m in M: N.append([]) for p in P: s = p.digits() s.sort() if s in M: r = M.index(s) N[r].append(p)   # Use a Sage combinations iterator to examine sets of 3 primes for n in N: X = combinations_iterator(n, 3) for x in X: if x-x == x-x: print x elapsed = time.time() - start   print "time elapsed: %s seconds" % elapsed

When executed, we get the following result.

[1487, 4817, 8147] [2969, 6299, 9629] time elapsed: 0.0485339164734 seconds

### Motivation

Many interesting computational problems, such as those on Project Euler require that one find the sum of proper divisors of a given integer. I had a fairly crude brute-force method for doing this, and was subsequently emailed a comment by Bjarki Ágúst Guðmundsson who runs the site www.mathblog.dk. He pointed me in the direction of this page and provided some sample code illustrating how such an approach runs asymptotically faster than the approach I had been taking. Awesome! I’m going to expand on that a bit here, providing some mathematical proofs behind the claims and providing code for those who may want to take advantage of this.

### The Mathematics Behind It All

Let the function $\sigma(n)$ be the sum of divisors for a positive integer $n$. For example,
$\sigma(6)=1+2+3+6=12.$

It should seem obvious that for any prime $p$ we have $\sigma(p)=1+p$. What about powers of primes? Let $\alpha\in\mathbb{Z}_+$, and then
$\sigma(p^\alpha)=1+p+p^2+\cdots+p^\alpha.$

We’d like to write that in a closed form, i.e., without the “$+\cdots+$”. We use a standard series trick to do that.

\begin{align} \sigma(p^\alpha) &= 1+p+p^2+\cdots+p^\alpha\cr p\sigma(p^\alpha) &= p+p^2+p^3+\cdots+p^{\alpha+1}\cr p\sigma(p^\alpha)-\sigma(p^\alpha) &= (p+p^2+\cdots+p^{\alpha+1})-(1+p+\cdots+p^\alpha)\cr p\sigma(p^\alpha)-\sigma(p^\alpha) &= p^{\alpha+1}-1\cr (p-1)\sigma(p^\alpha) &= p^{\alpha+1}-1\cr \sigma(p^\alpha) &=\frac{p^{\alpha+1}-1}{p-1}.\end{align}

That solves the problem of finding the sum of divisors for powers of primes. It would be nice if we could show that $\sigma$ is multiplicative on powers of primes, i.e., that $\sigma(p_1^{\alpha_1}p_2^{\alpha_2})=\sigma(p_1^{\alpha_1})\sigma(p_2^{\alpha_2})$. We’ll prove that this is the case, and solve the problem in general along the way.

Proposition: The function $\sigma$ is multiplicative on powers of primes.

Proof: Let $n$ be a positive integer written (uniquely, by the fundamental theorem of arithmetic) as
$n=\prod_{i=1}^m p_i^{\alpha_i}$
for $m$ distinct primes $p_i$ with $\alpha_i\in\mathbb{Z}_+$. Any divisor $k$ of $n$ then has the form
$k=\prod_{i=1}^m p_i^{\beta_i}$
where each $\beta_i$ satisfies $0\le\beta_i\le\alpha_i$. Then
$\sigma(n)=\sigma\left(\prod_{i=1}^m p_i^{\alpha_i}\right)$
is the sum of all divisors $k$ of $n$ and can be written by summing over all possible combinations of the exponents $\beta_i$. There are $\prod_{i=1}^m \alpha_i$ combinations, and we can form their sum and simplify it as follows.
\begin{align}\sigma(n) &= \sum_{1\le i\le m,\ 0\le\beta_i\le\alpha_i}p_1^{\beta_i}p_2^{\beta_2}\cdots p_m^{\beta_m}\cr &= \sum_{\beta_1=0}^{\alpha_1}p_1^{\beta_1}\left(\sum_{2\le i\le m,\ 0\le\beta_i\le\alpha_i}p_2^{\beta_2}p_3^{\beta_3}\cdots p_m^{\beta_m}\right)\cr &= \sum_{\beta_1=0}^{\alpha_1}p_1^{\beta_1}\sum_{\beta_2=0}^{\alpha_2}p_2^{\beta_2}\left(\sum_{3\le i\le m,\ 0\le\beta_i\le\alpha_i}p_3^{\beta_3}p_4^{\beta_4}\cdots p_m^{\beta_m}\right) \cr &= \vdots\cr &=\sum_{\beta_1=0}^{\alpha_1}p_1^{\beta_1}\sum_{\beta_2=0}^{\alpha_2}p_2^{\beta_2}\sum_{\beta_3=0}^{\alpha_3}p_3^{\beta_3}\ \cdots\ \sum_{\beta_m=0}^{\alpha_m}p_m^{\beta_m}\cr &=\sigma(p_1^{\alpha_1})\sigma(p_2^{\alpha_2})\cdots\sigma(p_m^{\alpha_m}).\end{align}
This completes the proof. Q.E.D.

Thus, we now have a formula for the sum of divisors of an arbitrary positive integer $n$ using the factorization of $n$. Namely,
$\sigma(n)=\sigma\left(\prod_{i=1}^m p_i^{\alpha_i}\right)=\prod_{i=1}^m\left(\frac{p_i^{\alpha_i+1}-1}{p_i-1}\right).$

This is something I use quite a bit for various problems and programming exercises, so I figured I could post it here. It’s a basic post that isn’t advanced at all, but that doesn’t mean that the implementation given below won’t save work for others. The idea is to create a list of primes in C by malloc’ing a sieve, then malloc’ing a list of specific length based on that sieve. The resulting list contains all the primes below a given limit (defined in the code). The first member of the list is an integer representing the length of the list.

#include <stdio.h> #include <stdlib.h> #include <malloc.h>   #define bool _Bool   static unsigned long prime_limit = 1000000;   unsigned long sqrtld(unsigned long N) { int b = 1; unsigned long res,s; while(1<<b<N) b+= 1; res = 1<<(b/2 + 1); for(;;) { s = (N/res + res)/2; if(s>=res) return res; res = s; } }   unsigned long * make_primes(unsigned long limit) { unsigned long *primes; unsigned long i,j; unsigned long s = sqrtld(prime_limit); unsigned long n = 0; bool *sieve = malloc((prime_limit + 1) * sizeof(bool)); sieve = 0; sieve = 0; for(i=2; i<=prime_limit; i++) sieve[i] = 1; j = 4; while(j<=prime_limit) { sieve[j] = 0; j += 2; } for(i=3; i<=s; i+=2) { if(sieve[i] == 1) { j = i * 3; while(j<=prime_limit) { sieve[j] = 0; j += 2 * i; } } } for(i=2;i<=prime_limit;i++) if(sieve[i]==1) n += 1; primes = malloc((n + 1) * sizeof(unsigned long)); primes = n; j = 1; for(i=2;i<=prime_limit;i++) if(sieve[i]==1) { primes[j] = i; j++; } free(sieve); return primes; }     int main(void) {   unsigned long * primes = make_primes(prime_limit);   printf("There are %ld primes <= %ld\n",primes,prime_limit);     free(primes); return 0; }

Say one wanted to form a list of all primes below 1,000,000. That’s what the above program does by default, since “prime_limit = 1000000.” If one compiles this and executes, you would get something like what follows. The timing is relatively respectable.

$gcc -O3 -o prime-sieve prime-sieve.c$ time ./prime-sieve There are 78498 primes <= 1000000   real 0m0.008s user 0m0.004s sys 0m0.000s

The code is linked here: prime-sieve.c

The following code creates a linked list with 20 random elements, and then reverses the list.

/* reverse-linked-list.cpp Jason B. Hill (jason@jasonbhill.com) reverse a linked list in C++ */   #include <iostream> #include <cstdlib>   using namespace std;   // Node class class Node {   public: int data; Node* next;   // create a new node Node(int x, Node* addr) { data = x; next = addr; } };   // LinkedList class class LinkedList { private: Node* head;   public:   // initialize a linked list with a NULL head LinkedList() { head = NULL; }   // determine if a list is empty bool is_empty() { if(head == NULL) return 1; else return 0; }   // add an item to a linked list void node_add(int val) { if(head == NULL) head = new Node(val, NULL); else { Node* n = head; while(n->next != NULL) n = n->next; n->next = new Node(val, NULL); } }   // return the length of a linked list unsigned int length() { if(head == NULL) return 0; if(head->next == NULL) return 1; unsigned int len = 1; Node* n = head; while(n->next != NULL) { n = n->next; len++; } return len; }   // print a linked list void print() { cout<<"["; if(head != NULL) { if(head->next == NULL) cout<<head->data; else { Node* n = head; while(n->next != NULL) { cout<<n->data<<","; n = n->next; } cout<<n->data; } } cout<<"]\n"; }   //reverse a linked list void reverse() { if(head != NULL and head->next != NULL) { Node* m = head; Node* n = head->next; Node* t = NULL; if(n->next != NULL) t = n->next; head->next = NULL; while(t->next != NULL) { n->next = m; m = n; n = t; t = t->next; } n->next = m; head = t; t->next = n; } } };   int main(void){   LinkedList L; unsigned int i;   srand(time(NULL)); for(i=0;i<20;i++) L.node_add(rand() % 100);   L.print();   L.reverse();   L.print();     return(0); }

This can be compiled (using gcc/g++) using:

g++ reverse-linked-list.cpp -o reverse-linked-list

An example execution is:

### Problem

Euler published the remarkable quadratic formula:

$n^2+n+41$

It turns out that the formula will produce 40 primes for the consecutive values $n=0$ to $39$. However, when $n=40$, $40^2+40+41=40(40+1)+41$ is divisible by 41, and certainly when $n=41$, $41^2+41+41$ is clearly divisible by 41.

Using computers, the incredible formula $n^2-79n+1601$ was discovered, which produces 80 primes for the consecutive values $n=0$ to $79$. The produce of the coefficients, $-79$ and $1601$ is $-126479$.

Considering quadratics of the form:

$n^2+an+b$ where $|a| where$|n|$is the modulus/absolute value of$n$, e.g.,$|11|=11$and$|-4|=4$. Find the product of the coefficients$a$and$b$, for the quadratic expression that produces the maximum number of primes for consecutive values of$n$, starting with$n=0$. ### Sage Solution Some observations: • Clearly,$b$must be a prime or a negative prime, since$n=0$must result in a prime. • We can refine a lower bound on$b$as follows: If a given combination$(a,b)$results in$m$consecutive primes, then clearly$b>m$, since otherwise we would obtain a factor of$m$in the given polynomial. Also, when$a$is negative, we must have$b>-(n^2+an)$, since the prime values$n^2+an+b$must be positive. Since we know that$n^2+n+41$returns 40 primes, we know that any interesting values for$a$and$b$must then satisfy$b>-(40^2+40a)$. Given these observations, I’ll write a routine in Sage. import time start = time.time() P = prime_range(1000) a_max, b_max, c_max = 0, 0, 0 for a in range(-1000,1001): for b in P: if b &lt; -1600-40*a or b &lt; c_max: continue c, n = 0, 0 while is_prime(n**2 + a*n + b): c += 1 n += 1 if c &gt; c_max: a_max, b_max, c_max = a, b, c prod = a_max * b_max elapsed = time.time() - start print "a=%s, b=%s, longest sequence = %s, prod=%s\nfound in %s seconds"\ % (a_max, b_max, c_max, prod, elapsed) When executed, we obtain the following. a=-61, b=971, longest sequence = 71, prod=-59231 found in 8.53578400612 seconds It’s entirely possible that we’re being a bit redundant with the integer values that we’re throwing at Sage’s “is_prime” function. We’ll cache a list of primes within the range that we’re considering. (I came up with the number 751000 below in the Python section.) import time start = time.time() P = prime_range(751000) L = [False] * 751000 for p in P: L[p] = True a_max, b_max, c_max = 0, 0, 0 for a in range(-1000,1001): for b in range(1,1001): if L[b] is False: continue if b &lt; -1600-40*a or b &lt; c_max: continue c, n = 0, 0 while L[n**2 + a*n + b] is True: c += 1 n += 1 if c &gt; c_max: a_max, b_max, c_max = a, b, c prod = a_max * b_max elapsed = time.time() - start print "a=%s, b=%s, longest sequence = %s, prod=%s\nfound in %s seconds"\ % (a_max, b_max, c_max, prod, elapsed) This indeed executes more quickly, as is seen in the output. a=-61, b=971, longest sequence = 71, prod=-59231 found in 1.39028286934 seconds ### Python Solution Of course, Sage’s prime/factorization functionality isn’t available in Python, and so we’re going to need to create a prime sieve and store primes in some form. One of the things I need to know is how large the values$n^2+an+b$can be. If we were to assume that the maximum value of the length of a prime sequence is 500 (which is an overestimate as we now know), then we’d have the maximum value of the quadratic given as: 500**2 + 1000 * 500 + 1000 751000 Let’s create a prime sieve that will return a list of all primes below a given number. def primes_xrange(stop): primes = [True] * stop primes, primes = [False] * 2 L = [] for ind, val in enumerate(primes): if val is True: primes[ind*2::ind] = [False] * (((stop - 1)//ind) - 1) L.append(ind) return L Let’s verify that it works. primes_xrange(20) [2, 3, 5, 7, 11, 13, 17, 19] How long does this function take to form all of the primes under 751000? time P = primes_xrange(751000) Time: CPU 0.39 s, Wall: 0.39 s Now, we’ll take the Sage routine from above and write it completely in Python. Here, I’m using only the sieve list portion of the primes_xrange function I just created. As above, I’m just checking to see if values in that list are True or False, corresponding to primes and composite numbers. This should take a bit longer to run than the Sage version above. import time def primes_xrange(stop): primes = [True] * stop primes, primes = [False] * 2 for ind, val in enumerate(primes): if val is True: primes[ind*2::ind] = [False] * (((stop - 1)//ind) - 1) return primes start = time.time() P = primes_xrange(751000) a_max, b_max, c_max = 0, 0, 0 for a in range(-1000,1001): for b in range(1,1001): if P[b] is False: continue if b &lt; -1600-40*a or b &lt; c_max: continue c, n = 0, 0 while P[n**2 + a*n + b] is True: c += 1 n += 1 if c &gt; c_max: a_max, b_max, c_max = a, b, c prod = a_max * b_max elapsed = time.time() - start print "a=%s, b=%s, longest sequence = %s, prod=%s\nfound in %s seconds"\ % (a_max, b_max, c_max, prod, elapsed) And it does run slightly slow compared to the Sage version, as expected. a=-61, b=971, longest sequence = 71, prod=-59231 found in 1.73107814789 seconds ### Cython Solution I’m going to take roughly the same approach as with the Python solution, but I’m going to malloc a C array instead of using the Python list. That should make things a bit more efficient. Before we do that, let’s see how much faster our routine runs if we just slap a “%cython” on the first line. %cython import time def primes_xrange(stop): primes = [True] * stop primes, primes = [False] * 2 for ind, val in enumerate(primes): if val is True: primes[ind*2::ind] = [False] * (((stop - 1)//ind) - 1) return primes start = time.time() P = primes_xrange(751000) a_max, b_max, c_max = 0, 0, 0 for a in range(-1000,1001): for b in range(1,1001): if P[b] is False: continue if b &lt; -1600-40*a or b &lt; c_max: continue c, n = 0, 0 while P[n**2 + a*n + b] is True: c += 1 n += 1 if c &gt; c_max: a_max, b_max, c_max = a, b, c prod = a_max * b_max elapsed = time.time() - start print "a=%s, b=%s, longest sequence = %s, prod=%s\nfound in %s seconds"\ % (a_max, b_max, c_max, prod, elapsed) We find that this does improve the timing, and our Python code (which hasn’t been modified at all) now runs more quickly than the Sage code. a=-61, b=971, longest sequence = 71, prod=-59231 found in 1.02033305168 seconds Rewriting to take advantage of Cython’s C datatypes, how much faster can we obtain the result? %cython import time from libc.stdlib cimport malloc, free start = time.time() cdef long i, j cdef bint *primes = malloc(751000 * sizeof(bint)) primes, primes = 0, 0 i = 2 while i &lt; 751000: primes[i] = 1 i += 1 i = 2 while i &lt; 751000: if primes[i] == 1: j = 2 * i while j &lt; 751000: primes[j] = 0 j += i i += 1 cdef long long a, b, c, t cdef long long pol, n cdef long a_max, b_max, c_max, prod a_max, b_max, c_max = 0, 0, 0 a = -1000 while a &lt; 1001: b = 2 while b &lt; 1001: if primes[b] == 0: b += 1 continue t = -1600-40*a if b &lt; t or b &lt; c_max: b += 1 continue c, n = 0, 0 pol = n**2 + a*n + b while primes[pol] == 1: c += 1 n += 1 pol = n**2 + a*n + b if c &gt; c_max: a_max, b_max, c_max = a, b, c b += 1 a += 1 prod = a_max * b_max free(primes) elapsed = time.time() - start print "a=%s, b=%s, longest sequence = %s, prod=%s\nfound in %s seconds"\ % (a_max, b_max, c_max, prod, elapsed) Now, this should run much more quickly. a=-61, b=971, longest sequence = 71, prod=-59231 found in 0.0381400585175 seconds Thus, the Sage version runs 1.245 times as fast as the Python version. The Cython version runs 45.387 times as fast as the Python version and 36.452 times as fast as the Sage version. ### Problem Let$d(n)$be defined as the sum of proper divisors of$n$(numbers less than$n$which divide evenly into$n$). If$d(a) = b$and$d(b) = a$, where$a\neq b$, then$a$and$b$are an amicable pair and each of a and b are called amicable numbers. For example, the proper divisors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110; therefore$d(220) = 284$. The proper divisors of 284 are 1, 2, 4, 71 and 142; so$d(284) = 220$. Evaluate the sum of all the amicable numbers under 10000. ### Python Solution I used the idea of creating a list and populating that list with values corresponding to the sum of divisors of numbers, and then I walk through the list and do some index and value checking. import time def sum_divisors(n): s = 0 for i in range(1,n): if n % i == 0: s += i return s def amicable_pairs_xrange(low,high): L = [sum_divisors(i) for i in range(low,high + 1)] pairs = [] for i in range(high - low + 1): ind = L[i] if i + low < ind and low <= ind and ind <= high and L[ind - low] == i + low: pairs.append([i+low,ind]) return pairs def sum_pairs(pairs): return sum([sum(pair) for pair in pairs]) start = time.time() ans = sum_pairs(amicable_pairs_xrange(1,10000)) elapsed = time.time() - start print("%s found in %s seconds") % (ans,elapsed) This returns the correct result. 31626 found in 7.86725997925 seconds ### Cython Solution One of the things I love about using Cython inside the Sage Notebook environment is that one can simply place a Cython statement inside Python code and often achieve some sort of speedup. All I’ve done in this first example is to do just that. I haven’t change the Python code at all yet. %cython import time def sum_divisors(n): s = 0 for i in range(1,n): if n % i == 0: s += i return s def amicable_pairs_xrange(low,high): L = [sum_divisors(i) for i in range(low,high + 1)] pairs = [] for i in range(high - low + 1): ind = L[i] if i + low < ind and low <= ind and ind <= high and L[ind - low] == i + low: pairs.append([i+low,ind]) return pairs def sum_pairs(pairs): return sum([sum(pair) for pair in pairs]) start = time.time() ans = sum_pairs(amicable_pairs_xrange(1,10000)) elapsed = time.time() - start print("%s found in %s seconds") % (ans,elapsed) This runs faster than the original, and that blows my mind. 31626 found in 4.98208904266 seconds We can improve on this obviously, by rewriting the involved functions using C datatypes. %cython import time cdef sum_divisors(unsigned int n): cdef unsigned int s = 0, i = 1 while i < n: if n % i == 0: s += i i += 1 return s cdef sum_amicable_pairs(unsigned int low, unsigned int high): cdef unsigned int a = low, b, sum = 0 while a <= high: b = sum_divisors(a) if b > a and sum_divisors(b) == a: sum += a + b a += 1 return sum start = time.time() ans = sum_amicable_pairs(1,10000) elapsed = time.time() - start print("%s found in %s seconds") % (ans,elapsed) We do achieve somewhat significant speedup now. 31626 found in 1.06677889824 seconds ### C Solution If I take the fast Cython version and rewrite it in C, I don’t get much better performance. #include <stdio.h> #include <stdlib.h> unsigned int sum_divisors(unsigned int n) { unsigned int s = 0, i = 1; while(i < n) { if(n % i == 0) s = s + i; i++; } return s; } unsigned int sum_amicable_pairs(unsigned int low, unsigned int high) { unsigned int a = low, b, sum = 0; while(a <= high) { b = sum_divisors(a); if(b > a && sum_divisors(b) == a) sum = sum + a + b; a++; } return sum; } int main(int argc, char **argv) { unsigned int ans = sum_amicable_pairs(1,10000); printf("result: %d\n",ans); return 0; } The result, compiled with heavy optimization, is only slightly faster than the Cython code. $ gcc -O3 -o problem-21 problem-21.c $time ./problem-21 result: 31626 real 0m1.032s About the only way to really improve this, beyond restructuring the algorithm, is to use OpenMP to run the process in parallel. ### C with OpenMP A parallel version using shared memory via OpenMP looks something like this. (I’ve rewritten it slightly.) #include <stdio.h> #include <stdlib.h> #include <omp.h> unsigned int sum_divisors(unsigned int n) { unsigned int s = 0, i = 1; while(i < n) { if(n % i == 0) s = s + i; i++; } return s; } int main(int argc, char **argv) { unsigned int a, b, nthreads, result = 0; #pragma omp parallel nthreads = omp_get_num_threads(); printf("no of threads %d\n", nthreads); #pragma omp parallel for reduction(+:result) private(b) for(a=0; a < 10000; a++) { b = sum_divisors(a); if(b > a && sum_divisors(b) == a) result = result + a + b; } printf("result: %d\n",result); return 0; } Now, we get much better performance. I’m running this on a 16-core machine, where 3 of the cores are currently being used. $ gcc -fopenmp -O3 -o problem-21-openmp problem-21-openmp.c $time ./problem-21-openmp no of threads 16 result: 31626 real 0m0.140s Running this command in the terminal (I ran it as root) managed to move the window buttons back to the right and arrange them in the correct order. gconftool -s /apps/metacity/general/button_layout -t string menu:minimize,maximize,close$n!$means$n\times(n-1)\times\cdots\times 3\times 2\times 1$. For example,$10!=10\times 9\times\cdots\times 2\times 1=3628800$and the sum of the digits in the number$10!$is 27. Find the sum of the digits in$100!$. ### Sage Solution There is a single-line solution in Sage. I’ve added a bit more here just to determine the timing. import time start = time.time() s = sum(factorial(100).digits()) elapsed = time.time() - start print "Solution %s found after %s seconds" % (s,elapsed) When executed, we find the following. Solution 648 found after 0.000201940536499 seconds ### Python Solution Without the built-in functions of Sage, we can build a factorial function and use some string functionality in Python to achieve the same result. import time def factorial(n): if n == 0: return 1 else: return n * factorial(n - 1) def sum_digits(n): return sum([int(i) for i in str(n)]) start = time.time() sumfac = sum_digits(factorial(100)) elapsed = time.time() - start print("%s found in %s seconds") % (sumfac,elapsed) When executed, the same result is returned. 648 found in 0.000312805175781 seconds We could also compute the factorial using a reduced lambda function as in the following code. #!/usr/bin/python import time def factorial(n): return reduce(lambda x,y: x*y, range(1,n+1,1)) def sum_digits(n): return sum([int(i) for i in str(n)]) start = time.time() sumfac = sum_digits(factorial(100)) elapsed = time.time() - start print("%s found in %s seconds") % (sumfac,elapsed) Problem: You are given the following information, but you may prefer to do some research for yourself. • 1 Jan 1900 was a Monday. • Thirty days has September, April, June and November. All the rest have thirty-one, Saving February alone, Which has twenty-eight, rain or shine. And on leap years, twenty-nine. • A leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400. How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)? ### Approach There are several different ways to approach this. The easiest, I think, is to use the Gaussian formula for day of week. It is a purely mathematical formula that I have encoded in the following Python code. ### Python Solution import time from math import floor """ Gaussian algorithm to determine day of week """ def day_of_week(year, month, day): """ w = (d+floor(2.6*m-0.2)+y+floor(y/4)+floor(c/4)-2*c) mod 7 Y = year - 1 for January or February Y = year for other months d = day (1 to 31) m = shifted month (March = 1, February = 12) y = last two digits of Y c = first two digits of Y w = day of week (Sunday = 0, Saturday = 6) """ d = day m = (month - 3) % 12 + 1 if m &gt; 10: Y = year - 1 else: Y = year y = Y % 100 c = (Y - (Y % 100)) / 100 w = (d + floor(2.6 * m - 0.2) + y + floor(y/4) + floor(c/4) - 2*c) % 7 return int(w) """ Compute the number of months starting on a given day of the week in a century """ def months_start_range(day,year_start,year_end): total = 0 for year in range(year_start, year_end + 1): for month in range(1,13): if day_of_week(year, month, 1) == day: total += 1 return total start = time.time() total = months_start_range(0,1901,2000) elapsed = time.time() - start print("%s found in %s seconds") % (total,elapsed) This returns the correct result. 171 found in 0.0681998729706 seconds That will run faster if executed directed in Python. Remember, I’m using the Sage notebook environment to execute most Python here. I’m also writing Cython in the same environment. ### Cython Solution There is a nearly trivial rewriting to Cython of the above Python code. %cython import time from math import floor """ Gaussian algorithm to determine day of week """ cdef day_of_week(int year, int month, int day): """ w = (d+floor(2.6*m-0.2)+y+floor(y/4)+floor(c/4)-2*c) mod 7 Y = year - 1 for January or February Y = year for other months d = day (1 to 31) m = shifted month (March = 1, February = 12) y = last two digits of Y c = first two digits of Y w = day of week (Sunday = 0, Saturday = 6) """ cdef int d = day cdef int m = (month - 3) % 12 + 1 cdef int Y if m &gt; 10: Y = year - 1 else: Y = year y = Y % 100 cdef int c = (Y - (Y % 100)) / 100 cdef double w w = (d + floor(2.6 * m - 0.2) + y + floor(y/4) + floor(c/4) - 2*c) % 7 return int(w) """ Compute the number of months starting on a given day of the week in range of years """ cdef months_start_range(int day, int year_start,int year_end): cdef unsigned int total = 0 cdef int year, month for year in range(year_start, year_end + 1): for month in range(1,13): if day_of_week(year, month, 1) == day: total += 1 return total start = time.time() total = months_start_range(0,1901,2000) elapsed = time.time() - start print("%s found in %s seconds") % (total,elapsed) The code is a bit longer, but it executes much faster. 171 found in 0.00387215614319 seconds The Cython code runs roughly 18 times faster. ### C Solution The Cython code was used as a model to create more efficient C code. The only issue here is in maintaining the correct datatypes (not too hard, but compared to Cython it is a pain). #include <stdio.h> #include <stdlib.h> #include <math.h> int day_of_week(int year, int month, int day) { // Using the Gaussian algorithm int d = day; double m = (double) ((month - 3) % 12 + 1); int Y; if(m > 10) Y = year - 1; else Y = year; int y = Y % 100; int c = (Y - (Y % 100)) / 100; int w = ((d+(int)floor(2.6*m-0.2)+y+ y/4 + c/4 -2*c))%7; return w; } long months_start_range(int day, int year_start, int year_end) { unsigned long total = 0; int year, month; for(year = year_start; year < year_end; year++) { for(month = 1; month <= 12; month++) { if(day_of_week(year, month, 1)==day) total++; } } return total; } int main(int argc, char **argv) { int iter = 0; long total; while(iter < 100000) { total = months_start_range(0,1901,2000); iter++; } printf("Solution: %ld\n",total); return 0; } Notice that this executes the loop 100,000 times, as I’m trying to get a good idea of what the average runtime is. We compile with optimization and the [[[-lm]]] math option. We get the following result. gcc -O3 -o problem-19 problem-19.c -lm$ time ./problem-19 Solution: 171   real 0m6.240s

The C code runs roughly 62 times as fast as the Cython and roughly 1124 times as fast as the Python. Each iteration executes in about 6.2000e-5 seconds.