For some reason, I thought the default was to use the most efficient protocol, but it isn't. Use the highest protocol you have available, which will usually be binary. As an example:
import cPickle
import pylab
z = pylab.random((10,10))
z ndarray 10x10: 100 elems, type `float64`, 800 bytes
len(cPickle.dumps(z)) -> 2371
len(cPickle.dumps(z, cPickle.HIGHEST_PROTOCOL)) -> 934
z = pylab.random((50,50))
z ndarray 50x50: 2500 elems, type `float64`, 20000 bytes
len(cPickle.dumps(z)) -> 55586
len(cPickle.dumps(z, cPickle.HIGHEST_PROTOCOL)) -> 20134
Hence my slow processing and bloated data files. The docstring confirms that the default is protocol 0:
cPickle.dump?
Type: builtin_function_or_method
Base Class:
String Form:
Namespace: Interactive
Docstring:
dump(obj, file, protocol=0) -- Write an object in pickle format to the given file.
See the Pickler docstring for the meaning of optional argument proto.
Note that load figures out the protocol automatically...
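
As a rough, self-contained sketch of the same comparison on Python 3, where cPickle no longer exists and the standard pickle module takes its place: the list of floats is only a stand-in for the pylab array (an assumption, so the byte counts will not match the ones above), and protocol 0 is passed explicitly because Python 3 no longer defaults to it.

```python
import pickle
import random

# Stand-in data (an assumption): a plain list of floats rather than a pylab array.
random.seed(0)
data = [random.random() for _ in range(2500)]

# Protocol 0 is the text format that cPickle.dump used by default.
text_blob = pickle.dumps(data, protocol=0)

# HIGHEST_PROTOCOL selects the newest binary format this interpreter supports.
binary_blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# Compare the two sizes; the binary form should be noticeably smaller.
print(len(text_blob), len(binary_blob))

# loads() takes no protocol argument: it detects the format from the data itself.
assert pickle.loads(text_blob) == data
assert pickle.loads(binary_blob) == data
```

The same protocol argument is accepted by pickle.dump when writing straight to a file, which must be opened in binary mode ('wb') for the binary protocols.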