
Pickling Python classes

Ok, this is the first thing in Python I have found to be annoying and non-intuitive. When you pickle an instance of a class, you need to make sure that the module defining the class is explicitly imported and in scope.

I kept banging my head against this problem and only understood it after looking at this guy's blog.

When you pickle an instance, the pickle file includes a coded import statement telling the interpreter which module to import to find the class definition. This leads to the following gotcha:

The following code defines a class, instantiates it and pickles it without problems:

# File: class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  a = A()
  # protocol=-1 selects the highest available (binary) protocol
  cPickle.dump(a, open('obja.pkl', 'wb'), protocol=-1)
  # protocol 0 is ASCII, so the module reference is readable in the dump
  print cPickle.dumps(a, protocol=0)

Notice the stringification of the object: it begins with
i__main__
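
If you want to see exactly where the pickle thinks the class lives, the standard library's pickletools module can disassemble the stream. A minimal sketch, assuming the obja.pkl written by the script above:

import pickletools

# Disassemble the pickle written by class_a.py. Near the top of the output,
# the opcode that rebuilds the instance names the module and class --
# here '__main__ A' -- that will be imported at load time.
pickletools.dis(open('obja.pkl', 'rb').read())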

Now, in a separate IPython session, we run the following code to load the object from the pickle:
import cPickle
m = cPickle.load(open('obja.pkl'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/Users/kghose/Research/2011/Papers/SpatialIntegration/Python/Sandbox/<ipython console> in <module>()
----> 1 m = cPickle.load(open('obja.pkl'))

AttributeError: 'FakeModule' object has no attribute 'A'

Whaaa? The pickle recorded the class as living in __main__, but the session doing the loading has a different __main__ (here an IPython shell, whose __main__ is a FakeModule stand-in), and that module has no attribute A, so unpickling fails.

Now, say we change the original file so that it explicitly imports its own module and instantiates the object via that module (as would happen if our pickling code lived in a different file from the class definition file):

# File: class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  # importing our own module means the instance is created as class_a.A,
  # so the pickle records 'class_a' rather than '__main__'
  import class_a
  a = class_a.A()
  cPickle.dump(a, open('obja.pkl', 'wb'), protocol=-1)
  print cPickle.dumps(a, protocol=0)

Now notice that the serialization begins with
iclass_a
which is our module name.

And when we try to load it, we don't get any errors.
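
For completeness, a minimal loading sketch, assuming it is run from a separate script or interpreter session with class_a.py importable (i.e. on sys.path):

import cPickle

# The unpickler sees the 'class_a' reference in the stream and imports
# that module itself, so we don't even have to import class_a here.
m = cPickle.load(open('obja.pkl', 'rb'))
print m.x  # -> 22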
