Format¶
The binpickle.format module contains the data structures that define the BinPickle format.
Users will not need these classes; they are documented here to describe the file format.
File Structure¶
BinPickle uses Pickle 5’s out-of-band buffer serialization support, and thus stores the pickled object in two parts:
The contents of the out-of-band buffers.
The Protocol 5 pickled bytes.
The pickle bytes are stored as one more buffer, so pickling an object with n buffers stores n+1 buffers in the file, the last of which contains the pickle bytes.
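The n+1 split can be reproduced with the standard library alone. The sketch below is not BinPickle's own code: `split_parts` and `join_parts` are hypothetical helper names, and it uses `pickle.PickleBuffer` wrappers directly as stand-ins for real objects that export out-of-band buffers.

```python
import pickle


def split_parts(obj):
    """Return the n out-of-band buffers followed by the pickle bytes (n+1 parts)."""
    buffers = []
    # list.append returns None, which tells pickle to keep each buffer out-of-band.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    # raw() exposes each buffer as a flat memoryview of bytes; copy it out here
    # so the parts are plain bytes objects.
    return [bytes(buf.raw()) for buf in buffers] + [payload]


def join_parts(parts):
    """Rebuild the object from the parts produced by split_parts."""
    *buffers, payload = parts
    return pickle.loads(payload, buffers=buffers)


# Two writable buffers, so the pickle stream references them instead of embedding them.
obj = [pickle.PickleBuffer(bytearray(b"aaaa")), pickle.PickleBuffer(bytearray(b"bb"))]
parts = split_parts(obj)
restored = join_parts(parts)
```

Here `parts` holds three items: the two buffer contents in pickling order, then the Protocol 5 pickle bytes, which only reference the buffers by position.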
The BinPickle format is inspired by Zip, with an index at the end of the file that tells the reader where in the file to find the various contents.
A Version 1 BinPickle file is organized as follows:
16-byte header, beginning with magic bytes BPCK (see FileHeader).
The out-of-band buffers, in order. Padding may appear before or after any buffer’s contents.
The pickle bytes, as a buffer.
The file index, stored as a list of IndexEntry objects encoded in MsgPack.
16-byte trailer (see FileTrailer).
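The layout above reduces to simple offset arithmetic. The following sketch computes where each section would start, assuming no padding between sections; `plan_layout` and its return structure are hypothetical and only illustrate the ordering, not BinPickle's writer.

```python
HEADER_SIZE = 16   # fixed-size header at the start of the file
TRAILER_SIZE = 16  # fixed-size trailer at the end of the file


def plan_layout(part_lengths, index_length):
    """Compute section positions for an unpadded Version 1 file.

    part_lengths: lengths of the n out-of-band buffers, followed by the
    length of the pickle bytes (n+1 values in total).
    index_length: size in bytes of the encoded index.
    """
    pos = HEADER_SIZE
    parts = []
    for length in part_lengths:
        parts.append((pos, length))  # (offset, length) for this buffer
        pos += length
    index_start = pos
    trailer_start = index_start + index_length
    return {
        "parts": parts,
        "index": (index_start, index_length),
        "trailer": trailer_start,
        "file_length": trailer_start + TRAILER_SIZE,
    }


layout = plan_layout([100, 50, 30], index_length=40)
```

A reader works in the opposite direction: it reads the trailer from the last 16 bytes, uses it to find the index, and uses the index to find each buffer.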
The position and length of each buffer is stored in the index, so buffers can have arbitrary
padding between them. They could even technically be out-of-order, but such a file should
not be generated. Uncompressed BinPickle files intended for memory-mapped use align each
buffer to the operating system page size (from mmap.PAGESIZE).
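Page alignment amounts to rounding each buffer's start position up to a multiple of the page size and filling the gap with padding. This is a minimal sketch of that arithmetic; the helper names are hypothetical and it does not show how BinPickle itself writes the padding.

```python
import mmap


def aligned_offset(pos, alignment=mmap.PAGESIZE):
    """Round pos up to the next multiple of alignment (unchanged if already aligned)."""
    return pos + (-pos % alignment)


def padding_before(pos, alignment=mmap.PAGESIZE):
    """Number of padding bytes needed before a buffer written at position pos."""
    return aligned_offset(pos, alignment) - pos
```

For example, a buffer that would otherwise start right after the 16-byte header starts at `mmap.PAGESIZE` instead, preceded by `mmap.PAGESIZE - 16` bytes of padding. Because the index records each buffer's actual offset, the reader does not need to know the alignment that was used.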
Classes¶
- class binpickle.format.FileHeader(version: int = 1, length: int = -1)¶
File header for a BinPickle file. The header is a 16-byte sequence containing the magic (BPCK) followed by version and offset information:
File version (2 bytes, big-endian). Currently only version 1 exists.
Reserved (2 bytes). Set to 0.
File length (8 bytes, big-endian). Length is signed; if the file length is not known, this field is set to -1.
- encode()¶
Encode the file header as bytes.
- classmethod decode(buf, *, verify=True)¶
Decode a file header from bytes.
- trailer_pos()¶
Get the position of the start of the file trailer.
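The header fields map naturally onto a `struct` format string. The sketch below is an illustration derived from the field list, not the library's code: the format `!4sHHq`, the function names, the checks performed when `verify` is true, and the rule that the trailer starts 16 bytes before the end of the file are all assumptions.

```python
import struct

MAGIC = b"BPCK"
# Assumed layout: 4-byte magic, unsigned 16-bit version, 16-bit reserved field,
# signed 64-bit file length. "!" selects big-endian with no alignment padding.
HEADER = struct.Struct("!4sHHq")
TRAILER_SIZE = 16


def encode_header(version=1, length=-1):
    """Pack the header fields into 16 bytes; the reserved field is always 0."""
    return HEADER.pack(MAGIC, version, 0, length)


def decode_header(buf, *, verify=True):
    """Unpack a 16-byte header and return (version, length)."""
    magic, version, _reserved, length = HEADER.unpack(buf)
    if verify:
        # Assumed checks: reject a wrong magic or an unknown version.
        if magic != MAGIC:
            raise ValueError("not a BinPickle file: bad magic")
        if version != 1:
            raise ValueError(f"unsupported version {version}")
    return version, length


def trailer_pos(length):
    """Start of the trailer, or None when the file length is unknown (-1)."""
    if length < 0:
        return None
    return length - TRAILER_SIZE
```

A writer that does not know the final file size in advance can emit the header with length -1; a reader can then fall back to seeking from the end of the file to find the trailer.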
- class binpickle.format.FileTrailer(offset: int, length: int, checksum: int)¶
File trailer for a BinPickle file. The trailer is a 16-byte sequence that tells the reader where to find the rest of the BinPickle data. It consists of the following fields:
Index start (8 bytes, big-endian). Measured in bytes from the start of the file.
Index length (4 bytes, big-endian). The number of bytes in the index.
Index checksum (4 bytes, big-endian). The Adler32 checksum of the index data.
- encode()¶
Encode the file trailer as bytes.
- classmethod decode(buf, *, verify=True)¶
Decode a file trailer from bytes.
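Reading the trailer and checking the index can be sketched with `struct` and `zlib.adler32`. This is a hedged illustration: the format `!QLL` assumes all three fields are unsigned, and `encode_trailer` and `read_index` are hypothetical helpers rather than methods of `FileTrailer`.

```python
import struct
import zlib

# Assumed layout: unsigned 64-bit index offset, unsigned 32-bit index length,
# unsigned 32-bit Adler32 checksum, all big-endian.
TRAILER = struct.Struct("!QLL")


def encode_trailer(offset, index_bytes):
    """Build the 16-byte trailer for an index stored at the given offset."""
    return TRAILER.pack(offset, len(index_bytes), zlib.adler32(index_bytes))


def read_index(data):
    """Locate the raw index bytes in a complete file image and verify them."""
    offset, length, checksum = TRAILER.unpack(data[-TRAILER.size:])
    index_bytes = data[offset:offset + length]
    if len(index_bytes) != length or zlib.adler32(index_bytes) != checksum:
        raise ValueError("index is truncated or corrupt")
    return index_bytes
```

The bytes returned by `read_index` would still need to be decoded from MsgPack to obtain the list of index entries; that step is left out because it requires a third-party package.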
- class binpickle.format.IndexEntry(offset: int, enc_length: int, dec_length: int, checksum: int, codec: Optional[tuple] = None)¶
Index entry for a buffer in the BinPickle index.
- to_repr()¶
Convert an index entry to its MsgPack-compatible representation.
- classmethod from_repr(repr)¶
Convert an index entry from its MsgPack-compatible representation.
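One plausible shape for such a representation is a plain dictionary keyed by field name, since MsgPack can encode maps of integers and sequences directly. The class below is a hypothetical stand-in (`IndexEntrySketch`), not the real `IndexEntry`; the dictionary form and the type check are assumptions, and only the field names are taken from the documented signature.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class IndexEntrySketch:
    """Hypothetical stand-in with the same fields as the documented signature."""

    offset: int
    enc_length: int
    dec_length: int
    checksum: int
    codec: Optional[tuple] = None

    def to_repr(self):
        # Assumed representation: a dict of field names to plain values.
        return asdict(self)

    @classmethod
    def from_repr(cls, repr):
        if not isinstance(repr, dict):
            raise TypeError("index entry representation must be a dict")
        return cls(**repr)


entry = IndexEntrySketch(offset=16, enc_length=100, dec_length=100, checksum=12345)
round_tripped = IndexEntrySketch.from_repr(entry.to_repr())
```

The parameter name `repr` mirrors the documented `from_repr(repr)` signature, at the cost of shadowing the built-in inside that method.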