The pickle module implements a basic but powerful algorithm
for "pickling" (a.k.a. serializing, marshalling or flattening)
nearly arbitrary Python objects. Pickling converts
objects to a stream of bytes; the reverse operation is "unpickling". This is a
more primitive notion than persistence -- although pickle
reads and writes file objects, it does not handle the issue of naming
persistent objects, nor the (even more complicated) area of concurrent
access to persistent objects. The pickle module can
transform a complex object into a byte stream and it can transform the
byte stream into an object with the same internal structure. The most
obvious thing to do with these byte streams is to write them onto a
file, but it is also conceivable to send them across a network or
store them in a database. The module
shelve
Note: The pickle module is rather slow. A
reimplementation of the same algorithm in C, which is up to 1000 times
faster, is available as the
cPickle module.
Although the pickle module can use the built-in module
marshal
The data format used by pickle is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as
XDR
By default, the pickle data format uses a printable ASCII representation. This is slightly more voluminous than a binary representation. The big advantage of using printable ASCII (and of some other characteristics of pickle's representation) is that for debugging or recovery purposes it is possible for a human to read the pickled file with a standard text editor.
A binary format, which is slightly more efficient, can be chosen by specifying a nonzero (true) value for the bin argument to the Pickler constructor or the dump() and dumps() functions. The binary format is not the default because of backwards compatibility with the Python 1.4 pickle module. In a future version, the default may change to binary.
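The difference between the two formats can be seen with a small sketch. It assumes a current Python 3 interpreter, where the bin argument described above has been replaced by a protocol argument (protocol 0 is the printable ASCII format, protocol 1 the original binary format); the record value is made up for illustration.

```python
import pickle

# Hypothetical sample data; any picklable object would do.
record = {"name": "spam", "values": [1, 2, 3]}

# Protocol 0 is the printable ASCII format; protocol 1 is the
# original binary format (selected by bin=1 in Python 1.5).
text_form = pickle.dumps(record, protocol=0)
binary_form = pickle.dumps(record, protocol=1)

# The text form can be inspected as ordinary ASCII text.
print(text_form.decode("ascii"))

# Unpickling needs no hint about which format was used.
from_text = pickle.loads(text_form)
from_binary = pickle.loads(binary_form)
```

Both calls to loads() return an object equal to record, whichever format was written.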
The pickle module doesn't handle code objects, which the
marshal module does.
For the benefit of persistency modules written using pickle, it
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable ASCII characters. The resolution of
such names is not defined by the pickle module -- the
persistent object module will have to implement a method
persistent_load(). To write references to persistent objects,
the persistent module must define a method persistent_id() which
returns either None or the persistent ID of the object.
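As a rough illustration of external references, the sketch below assumes a current Python 3 interpreter, where the two methods are supplied by subclassing Pickler and Unpickler. The Account class, the STORE dictionary and the RefPickler/RefUnpickler names are hypothetical stand-ins for a real persistent object module.

```python
import io
import pickle

class Account:
    """Hypothetical object that lives outside the pickle, e.g. in a database."""
    def __init__(self, key, owner):
        self.key = key
        self.owner = owner

# Hypothetical stand-in for the external store, keyed by persistent ID.
STORE = {"acct-1": Account("acct-1", "alice")}

class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a persistent ID for external objects, None for everything else.
        if isinstance(obj, Account):
            return obj.key
        return None

class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolving the name is entirely up to the persistent module.
        try:
            return STORE[pid]
        except KeyError:
            raise pickle.UnpicklingError("unknown persistent ID: %r" % (pid,))

order = {"item": "spam", "account": STORE["acct-1"]}

buf = io.BytesIO()
RefPickler(buf).dump(order)      # only the ID "acct-1" is written for the account
buf.seek(0)
restored = RefUnpickler(buf).load()
```

Here restored["account"] is the very object held in STORE, and the account's own data never enters the byte stream.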
There are some restrictions on the pickling of class instances.
First of all, the class must be defined at the top level in a module. Furthermore, all its instance variables must be picklable.
When a pickled class instance is unpickled, its __init__() method is normally not invoked. Note: This is a deviation from previous versions of this module; the change was introduced in Python 1.5b2. The reason for the change is that in many cases it is desirable to have a constructor that requires arguments; it is a (minor) nuisance to have to provide a __getinitargs__() method.
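A small sketch of this behaviour, assuming a current Python 3 interpreter; the Temperature class and its init_calls counter are hypothetical and exist only to make constructor calls visible.

```python
import pickle

class Temperature:
    """Hypothetical class whose constructor requires an argument."""
    init_calls = 0  # counts how often __init__() has run

    def __init__(self, degrees):
        Temperature.init_calls += 1
        self.degrees = degrees

original = Temperature(21.5)          # __init__() runs once here
copy = pickle.loads(pickle.dumps(original))

# The instance data came back, but __init__() was not called again.
calls_after_unpickling = Temperature.init_calls
```

After the round trip the counter is still 1, while copy.degrees holds the pickled value.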
If it is desirable that the __init__() method be called on
unpickling, a class can define a method __getinitargs__(),
which should return a tuple containing the arguments to be
passed to the class constructor (__init__()). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
Classes can further influence how their instances are pickled -- if
the class
Note that when class instances are pickled, their class's code and data are not pickled along with them. Only the instance data are pickled. This is done on purpose, so you can fix bugs in a class or add methods and still load objects that were created with an earlier version of the class. If you plan to have long-lived objects that will see many versions of a class, it may be worthwhile to put a version number in the objects so that suitable conversions can be made by the class's __setstate__() method.
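One possible way to carry a version number is sketched below, assuming a current Python 3 interpreter. The Point class, its label attribute and the version numbering scheme are all hypothetical; an "old" instance is simulated by building one without the newer attributes.

```python
import pickle

class Point:
    """Hypothetical class whose instance layout changed between versions."""
    VERSION = 2

    def __init__(self, x, y, label=""):
        self.version = self.VERSION
        self.x = x
        self.y = y
        self.label = label  # attribute added in (hypothetical) version 2

    def __setstate__(self, state):
        # Pickles written before versioning was added carry no "version" key.
        if state.get("version", 1) < 2:
            state = dict(state, label="", version=2)
        self.__dict__.update(state)

# Simulate an instance saved by the older class: no version, no label.
old = Point.__new__(Point)
old.__dict__.update({"x": 1, "y": 2})
data = pickle.dumps(old)

restored = pickle.loads(data)  # __setstate__() upgrades the old state
```

The restored object has the version 2 layout even though the pickle was written without it.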
When a class itself is pickled, only its name is pickled -- the class definition is not pickled, but re-imported by the unpickling process. Therefore, the restriction that the class must be defined at the top level in a module applies to pickled classes as well.
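The following sketch, assuming a current Python 3 interpreter, shows both points: a class is pickled as a name and re-imported, and a class that is not defined at the top level of a module is rejected. OrderedDict is used only because it is a convenient importable class; make_local_class is a hypothetical helper.

```python
import pickle
from collections import OrderedDict

# Pickling a class stores its module and name, not its definition.
data = pickle.dumps(OrderedDict)
same_class = pickle.loads(data)     # found again by importing its module

def make_local_class():
    class Local:                    # not reachable as a module-level name
        pass
    return Local

try:
    pickle.dumps(make_local_class())
    local_error = None
except (pickle.PicklingError, AttributeError) as exc:
    # Which exception is raised depends on the Python version.
    local_error = exc
```

The unpickled class is the identical class object, and the attempt to pickle the nested class fails.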
The interface can be summarized as follows.
To pickle an object x onto a file f, open for writing:
p = pickle.Pickler(f)
p.dump(x)
A shorthand for this is:
pickle.dump(x, f)
To unpickle an object x from a file f, open for reading:
u = pickle.Unpickler(f)
x = u.load()
A shorthand is:
x = pickle.load(f)
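Put together, a complete round trip might look like the sketch below. It assumes a current Python 3 interpreter, where the file must be opened in binary mode; the file name and the sample object are made up for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical sample object.
x = {"name": "spam", "counts": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "x.pickle")   # hypothetical file name

    # Pickle x onto a file opened for writing.
    with open(path, "wb") as f:
        p = pickle.Pickler(f)
        p.dump(x)

    # Unpickle it again from the same file opened for reading.
    with open(path, "rb") as f:
        u = pickle.Unpickler(f)
        y = u.load()

    # The shorthand function gives the same result.
    with open(path, "rb") as f:
        z = pickle.load(f)
```

Both y and z are new objects equal to x.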
The Pickler class only calls the method f.write()
with a string argument. The Unpickler class calls the methods f.read()
(with an integer argument) and f.readline() (without argument),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
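A minimal sketch of such non-file objects follows. It assumes a current Python 3 interpreter, where the data passed around are bytes rather than strings; the ChunkSink and ChunkSource classes are hypothetical.

```python
import pickle

class ChunkSink:
    """Hypothetical write-only object: just collects what it is given."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))

class ChunkSource:
    """Hypothetical read-only object offering read(n) and readline()."""
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read(self, n):
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk

    def readline(self):
        end = self._data.find(b"\n", self._pos)
        end = len(self._data) if end < 0 else end + 1
        line = self._data[self._pos:end]
        self._pos = end
        return line

original = ["spam", ("eggs", 2), {"ham": None}]

sink = ChunkSink()
pickle.Pickler(sink).dump(original)

source = ChunkSource(b"".join(sink.chunks))
loaded = pickle.Unpickler(source).load()
```

Neither class is a file, yet the object survives the round trip because the required methods are present.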
The constructor for the Pickler class has an optional second argument, bin. If this is present and true, the binary pickle format is used; if it is absent or false, the (less efficient, but backwards compatible) text pickle format is used. The Unpickler class does not have an argument to distinguish between binary and text pickle formats; it accepts either format.
The following types can be pickled:
None
Attempts to pickle unpicklable objects will raise the PicklingError exception; when this happens, an unspecified number of bytes may have been written to the file.
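One way to keep a partially written pickle out of a file is to pickle into memory first and write only on success. The sketch below assumes a current Python 3 interpreter, where an unpicklable object may raise PicklingError, AttributeError or TypeError depending on its kind; safe_dump is a hypothetical helper, not part of the module.

```python
import io
import pickle

def safe_dump(obj, f):
    """Hypothetical helper: write obj to f only if pickling fully succeeds."""
    try:
        data = pickle.dumps(obj)  # pickle into memory first
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        # Nothing has been written to f at this point.
        return exc
    f.write(data)
    return None

target = io.BytesIO()

# A lambda cannot be pickled, so this attempt fails cleanly.
error = safe_dump([1, 2, lambda: 3], target)
size_after_failure = len(target.getvalue())

# A picklable object is written as usual.
ok = safe_dump([1, 2, 3], target)
```

After the failed attempt the target is still empty; after the second call it holds one complete pickle.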
It is possible to make multiple calls to the dump() method of the same Pickler instance. These must then be matched to the same number of calls to the load() method of the corresponding Unpickler instance. If the same object is pickled by multiple dump() calls, the load() calls will all yield references to the same object. Warning: this is intended for pickling multiple objects without intervening modifications to the objects or their parts. If you modify an object and then pickle it again using the same Pickler instance, the object is not pickled again -- a reference to it is pickled and the Unpickler will return the old value, not the modified one. (There are two problems here: (a) detecting changes, and (b) marshalling a minimal set of changes. I have no answers. Garbage Collection may also become a problem here.)
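The warning can be reproduced with a short sketch, assuming a current Python 3 interpreter and an in-memory buffer in place of a file; the shared list is made up for illustration.

```python
import io
import pickle

buf = io.BytesIO()
p = pickle.Pickler(buf)

shared = ["spam"]
p.dump(shared)            # the list contents are written here
shared.append("eggs")     # modification between the two dump() calls
p.dump(shared)            # only a reference to the earlier pickle is written

buf.seek(0)
u = pickle.Unpickler(buf)
first = u.load()
second = u.load()         # one load() per dump(), on the same Unpickler
```

Both loads return the same list object, and it still has the old value ["spam"]; the appended item was never pickled.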
Apart from the Pickler and Unpickler classes, the module defines the following functions, and an exception:
See Also: