BinPickle¶
The BinPickle library provides an optimized file format for serializing Python objects
in a scientific computing setting. It uses Pickle Protocol 5 (with the pickle5
library on older versions of Python) to efficiently serialize objects with large
binary data blobs such as NumPy arrays; one of the primary use cases for BinPickle
is efficiently serializing scikit-style statistical and machine learning models.
BinPickle supports a few useful features on top of standard pickling:
Optional per-buffer compression
Memory-mapped buffers (when uncompressed) for efficiently sharing
BinPickle wraps Python’s pickling functionality, so any object that can be pickled (including SciKit models) can be stored with BinPickle. If the object supports Pickle Protocol 5 (or stores most of its data in NumPy arrays, which in recent versions support Pickle 5), then large array data will be efficiently stored, either compressed (with Blosc compression by default) or page-aligned and ready for memory-mapping, possibly into multiple processes simultaneously.
Quick Start¶
Save an object:
from binpickle import dump, load
dump(my_large_object, 'file.bpk')
Load an object:
model = load('file.bpk')