Problem: The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17. Find the sum of all the primes below two million.

Sage Solution

There is a direct Sage solution here. We’ll create a list of all primes under two million and sum over that list. We’ll also throw a time component in front so we know how long Sage takes to do the computations.

time sum(prime_range(2e6))

Executing this gives the following result.

142913828922
Time: CPU 0.04 s, Wall: 0.05 s

Python Solution

We’ll use our prime sieve from Project Euler Problem 7 and simply rewrite things a bit.

import time
 
def prime_sum(n):
    if n < 2: return 0
    if n == 2: return 2
    if n % 2 == 0: n += 1
    primes = [True] * n
    primes[0],primes[1] = [None] * 2
    sum = 0
    for ind,val in enumerate(primes):
        if val is True and ind > n ** 0.5 + 1:
            sum += ind
        elif val is True:
            primes[ind*2::ind] = [False] * (((n - 1)//ind) - 1)
            sum += ind
    return sum
 
start = time.time()
sum = prime_sum(2000000)
elapsed = (time.time() - start)
 
print "found %s in %s seconds" % (sum,elapsed)

This executes in under a second:

found 142913828922 in 0.838065862656 seconds

Cython Solution

What is a bit amusing is that we can do a trivial rewriting to Cython and achieve a speedup.

%cython
 
import time
 
cdef prime_sum(unsigned long n):
    if n < 2: return 0
    if n == 2: return 2
    if n % 2 == 0: n -= 1
    primes = [True] * n
    primes[0],primes[1] = [None] * 2
    sum = 0
    for ind,val in enumerate(primes):
        if val is True and ind > n ** 0.5 + 1:
            sum += ind
        elif val is True:
            primes[ind*2::ind] = [False] * (((n - 1)//ind) - 1)
            sum += ind
    return sum
 
start = time.time()
sum = prime_sum(2000000)
elapsed = (time.time() - start)
 
print "found %s in %s seconds" % (sum,elapsed)

When executed, we get:

found 142913828922 in 0.277922153473 seconds

The Cython code (which has minimal modifications) is about 3 times as fast as the Python code. Let’s see how much we can improve the Cython version. We’ll rewrite the list of integers as a pointer-based C array of Boolean values. This should speed things up a bit.

%cython
 
import time
from libc.stdlib cimport malloc, free
 
# a function to sum all prime numbers below a given number
cdef prime_sum(unsigned long n):
 
    if n < 2: return 0
    if n == 2: return 2
    # if n is even, use next lowest number
    if n % 2 == 0: n -= 1
 
    cdef bint *is_prime = <bint *>malloc(n * sizeof(bint))
 
    # set all to prime and then sieve
    cdef unsigned long int i,j,sum = 0
    i = 0
    while i < n:
        is_prime[i] = 1
        i += 1
    #for i in range(n): is_prime[i] = 1
    is_prime[0],is_prime[1] = 0,0
    i = 2
    while i < n ** 0.5 + 1:
        if is_prime[i] == 1:
            # add this to the sum
            sum += i
            # remove all multiples of i from prime list
            j = i ** 2
            while j <= n:
                is_prime[j] = 0
                j += i
        i += 1
    while i <= n:
        if is_prime[i]: sum += i
        i += 1
    free(is_prime)
 
    return sum
 
start = time.time()
sum = prime_sum(2000000)
elapsed = (time.time() - start)
 
print "found %s in %s seconds" % (sum,elapsed)

When executed, we now get the following.

found 142913828922 in 0.0436608791351 seconds

The Cython executes roughly 19 times faster than the original Python code.


Comments are closed