Friday, July 4, 2014

numpy arrays and iterators

Is it better to access a numpy array with an iterator or indexing?

import numpy

# This is how I've always done it - straight python idiomatic usage
def iterate1(m=100000):
  a = numpy.empty(m)
  sum = 0
  for n in a:
    sum += n
  return sum

# Didn't know about this one, but it is the recommended one in the docs
def iterate2(m=100000):
  a = numpy.empty(m)
  sum = 0
  for n in numpy.nditer(a):
    sum += n
  return sum

# This is C-like
def loop(m=100000):
  a = numpy.empty(m)
  sum = 0
  for n in range(a.size):
    sum += a[n]
  return sum

And the shootout?

>>> %timeit iterate1()
10 loops, best of 3: 35.4 ms per loop
>>> %timeit iterate2()
10 loops, best of 3: 63.7 ms per loop
>>> %timeit loop()
10 loops, best of 3: 45.1 ms per loop


>>> %timeit iterate1(10**6)
1 loops, best of 3: 368 ms per loop
>>> %timeit iterate2(10**6)
1 loops, best of 3: 605 ms per loop
>>> %timeit loop(10**6)
1 loops, best of 3: 443 ms per loop

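So the Pythonic one seems to be the best: at both sizes, plain iteration takes roughly 20% less time than indexing and roughly 40% less than numpy.nditer.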
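
To rerun the comparison outside IPython, here is a minimal sketch using the standard library's timeit module. It is not the exact code timed above. The function names and the best_time helper are made up for illustration, the array is created once with numpy.ones rather than numpy.empty so the values being summed are defined, and the allocation is kept out of the timed region. The absolute numbers will differ from machine to machine.

```python
import timeit

import numpy


def iterate_direct(a):
    # Plain Python iteration over the array, as in iterate1
    total = 0.0
    for x in a:
        total += x
    return total


def iterate_nditer(a):
    # numpy.nditer yields zero-dimensional arrays, so convert each to a float
    total = 0.0
    for x in numpy.nditer(a):
        total += float(x)
    return total


def iterate_index(a):
    # C-like indexing, as in loop
    total = 0.0
    for i in range(a.size):
        total += a[i]
    return total


def best_time(func, a, repeat=3, number=1):
    # Hypothetical helper: best-of-`repeat` time in seconds for one call of func(a)
    timings = timeit.repeat(lambda: func(a), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    # Assumption: initialized data instead of numpy.empty, allocated outside the timing
    a = numpy.ones(100000)
    for func in (iterate_direct, iterate_nditer, iterate_index):
        print(func.__name__, best_time(func, a))
```

Taking the minimum over a few repeats mirrors the "best of 3" that %timeit reports, so the output should be comparable in spirit to the figures above.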
