Pickling Python classes

Ok, this is the first thing in Python I have found to be annoying and nonintuitive. When you pickle an instance of a class, you need to make sure the class is reached through the module that defines it (explicitly imported), not through __main__.

I kept banging my head against this problem and only understood it after looking at this guy's blog.

When you pickle an instance of a class, the pickle file includes a coded import statement telling the interpreter which module to import to find the definition of the class. This leads to the following gotcha.

The following code will define a class, instantiate it and pickle it without problems:

#File class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  a = A()
  cPickle.dump(a, open('obja.pkl','wb'), protocol=-1)
  print cPickle.dumps(a, protocol=0)

Notice the serialized form of the instance: it begins with
i__main__
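
The same thing can be checked in Python 3, where cPickle has been folded into pickle. This is only a sketch: fractions.Fraction is a stand-in class chosen because it ships with Python, not something from the code above. It shows that the pickle carries the module name and the class name, not the class body.

```python
import pickle
import pickletools
from fractions import Fraction  # stand-in class, an assumption for this sketch

# Protocol 0 is the text-like protocol used for the printout above.
payload = pickle.dumps(Fraction(1, 2), protocol=0)

# The payload names the defining module and the class.
has_module_name = b"fractions" in payload
has_class_name = b"Fraction" in payload
print(has_module_name, has_class_name)

# pickletools.dis prints an opcode listing; the GLOBAL entry is the
# "coded import statement" that the unpickler follows at load time.
pickletools.dis(payload)
```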

Now, we run the following code in a separate interpreter session to load the object from the pickle:
import cPickle
m = cPickle.load(open('obja.pkl'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/Users/kghose/Research/2011/Papers/SpatialIntegration/Python/Sandbox/ in ()
----> 1 m = cPickle.load(open('obja.pkl'))

AttributeError: 'FakeModule' object has no attribute 'A'

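Whaaa?

The pickle says the class lives in __main__, but in the session doing the loading __main__ is a different module (here IPython's FakeModule stand-in), and that module has no A.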
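
The failure can be reproduced in one Python 3 script. This is a rough sketch, not the original setup: the module name class_a_demo is hypothetical and is built in memory so the example needs no extra files. Deleting A from the module plays the part of loading in a session whose module of that name has no A.

```python
import pickle
import sys
import types

# Hypothetical in-memory module standing in for class_a.py.
demo = types.ModuleType("class_a_demo")
class_source = (
    "class A:\n"
    "    def __init__(self):\n"
    "        self.x = 22\n"
)
exec(class_source, demo.__dict__)  # A.__module__ becomes "class_a_demo"
sys.modules["class_a_demo"] = demo

payload = pickle.dumps(demo.A())

# While the module still defines A, loading works.
restored_x = pickle.loads(payload).x

# Remove A from the module, then try to load the same bytes again.
del demo.A
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc

print(restored_x)
print(type(load_error).__name__)
```

The bytes did not change between the two loads. Only the contents of the named module did, which is all the unpickler looks at.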

Now, say we change the original file so that it explicitly imports its own module and instantiates the object as a class from that module (as would happen if the pickling code were in a different file from the class definition):

#File class_a.py
import cPickle

class A:
  def __init__(self):
    self.x = 22

if __name__ == "__main__":
  import class_a
  a = class_a.A()
  cPickle.dump(a, open('obja.pkl','wb'), protocol=-1)
  print cPickle.dumps(a, protocol=0)

Now notice that the serialization begins with
iclass_a
which is our module name.

And when we try to load it, we don't get any errors, as long as class_a can be imported from wherever we do the loading.
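
One way to catch this early is to check where the class lives before writing the file. The helper below is a hypothetical sketch in Python 3; dump_portable is not part of pickle, and the error message wording is my own.

```python
import pickle


def dump_portable(obj, path):
    """Pickle obj to path, refusing classes defined in __main__.

    Hypothetical helper for illustration; not part of the pickle module.
    """
    cls = type(obj)
    if cls.__module__ == "__main__":
        raise ValueError(
            "class %s is defined in __main__; import it from its own "
            "module before pickling" % cls.__name__
        )
    # Binary mode matters: every pickle protocol above 0 writes raw bytes.
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
```

With the second version of class_a.py above, dump_portable(class_a.A(), 'obja.pkl') would go through, while the first version's a = A() would raise the ValueError instead of producing a pickle that only loads when run as that same script.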
