Thursday, July 7, 2011

Pickling Python classes

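Ok, this is the first thing in Python I have found to be annoying and nonintuitive. When you pickle an instance of a class, you need to make sure the class comes from a module that is explicitly imported under its real name, both where you pickle and where you unpickle. A class defined in the file you run directly is recorded as belonging to __main__, and that will not match at load time.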

I kept banging my head against this problem and only understood it after looking at this guy's blog.

When you pickle an instance of a class, the pickle file does not contain the class definition. Instead it includes what amounts to a coded import statement, telling the interpreter which module to import to find the definition of the class. This leads to the following gotcha:
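To see that "coded import statement" directly, here is a minimal sketch. It assumes Python 3's pickle and pickletools modules rather than the cPickle used in this post, and it borrows fractions.Fraction from the standard library as a stand-in class; both choices are mine, for illustration only.

```python
import pickle
import pickletools
from fractions import Fraction

# Protocol 0 is the text protocol, so the module and class names are readable.
data = pickle.dumps(Fraction(1, 3), protocol=0)

# Collect every GLOBAL opcode: each one names a module and an attribute
# that the unpickler will import and look up.
refs = [arg for op, arg, pos in pickletools.genops(data) if op.name == "GLOBAL"]

print(data)
print(refs)
```

The pickle carries the names "fractions" and "Fraction" plus the instance state, and nothing of the class body itself. Whoever loads it has to be able to import that module and find that class in it.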

The following code defines a class, instantiates it and pickles it without problems:

#File class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  a = A()
  cPickle.dump(a, open('obja.pkl','wb'), protocol=-1)
  print cPickle.dumps(a, protocol=0)

Notice the serialized form of the object: it begins with
i__main__

Now we run the following code in a fresh IPython session to load the object from the pickle:
import cPickle
m = cPickle.load(open('obja.pkl'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/Users/kghose/Research/2011/Papers/SpatialIntegration/Python/Sandbox/ in ()
----> 1 m = cPickle.load(open('obja.pkl'))

AttributeError: 'FakeModule' object has no attribute 'A'

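Whaaa? The pickle says the class lives in __main__, but in the loading session __main__ is a different module (here IPython's FakeModule stand-in), and that module has no A.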
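The failure can be reproduced in a single file. This is only a sketch under some assumptions: it uses Python 3's pickle instead of cPickle, the module name demo_main is made up to play the role of __main__, and install_module is a hypothetical helper that registers an in-memory module, not something from pickle or from the code in this post.

```python
import pickle
import sys
import types

SOURCE = """
class A:
    def __init__(self):
        self.x = 22
"""

def install_module(name, source=""):
    """Register an in-memory module under `name` (hypothetical helper)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# "Writer" session: the module called demo_main defines A.
writer = install_module("demo_main", SOURCE)
data = pickle.dumps(writer.A())

# "Reader" session: a module with the same name exists but has no A,
# which mirrors the stand-in __main__ of the interactive session.
install_module("demo_main")
try:
    pickle.loads(data)
    load_error = None
except AttributeError as exc:
    load_error = exc

# Once the name resolves to a module that defines A, loading works again.
install_module("demo_main", SOURCE)
restored = pickle.loads(data)

print(type(load_error).__name__, restored.x)
```

The pickled bytes never change between the failed and the successful load. The only thing that differs is what the recorded module name points to at load time.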

Now, say we change the original file to explicitly import our module and instantiate the object from the class in that module (as would happen if our pickling code were in a different file from the class definition):

#File class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  import class_a
  a = class_a.A()
  cPickle.dump(a, open('obja.pkl','wb'), protocol=-1)
  print cPickle.dumps(a, protocol=0)

Now notice that the serialization begins with
iclass_a
which is our module name.

And when we try to load it, we don't get any errors, as long as class_a.py can be imported from wherever we do the loading.
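One way to avoid being bitten again is to refuse to pickle anything whose class claims to live in __main__. The sketch below assumes Python 3's pickle; dump_checked and load are hypothetical helpers of my own naming, not part of pickle, and the ScriptLocal class only imitates what A looks like when class_a.py is run directly.

```python
import os
import pickle
import tempfile

def dump_checked(obj, path):
    """Pickle obj to path, refusing classes that live in __main__.

    Hypothetical helper, not part of pickle or of the original post.
    """
    module = type(obj).__module__
    if module == "__main__":
        raise ValueError(
            "%s is defined in __main__; import it from its own module "
            "before pickling" % type(obj).__name__
        )
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)

def load(path):
    # Binary mode matters: non-zero protocols write binary data.
    with open(path, "rb") as handle:
        return pickle.load(handle)

# A class stamped as living in __main__, like A when class_a.py is run directly.
ScriptLocal = type("ScriptLocal", (), {"__module__": "__main__"})
try:
    dump_checked(ScriptLocal(), "unused.pkl")
    rejected = False
except ValueError:
    rejected = True

# An object whose class comes from an importable module goes through.
path = os.path.join(tempfile.mkdtemp(), "obj.pkl")
dump_checked(3 + 4j, path)
print(rejected, load(path))
```

The check only looks at the top-level object, so an instance of a __main__ class tucked inside a list or an attribute would still slip through. It is a tripwire for the common case, not a full audit.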
