Module sndjvu_format::ser

Serialization to the DjVu transfer format.

If you have a Rust data type that represents a DjVu document, implementing this module’s Serialize trait allows you to turn a value of that type into a blob of bytes in the transfer format. The structure of the document is expressed by a sequence of method calls on various “serializer” objects, such as Serializer, SerializeMultiPageBundled, etc. This approach should be familiar from APIs like std::fmt::Debug and serde::Serialize.
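
For readers who have not met this style of API, here is a reminder of how it looks with std::fmt::Debug, which the paragraph above names as a precedent. The Page type and its fields are made up for illustration, and nothing in this sketch is part of sndjvu_format. The point is only that the implementor describes structure through method calls on a helper object, and the helper decides what is actually written.

```rust
use std::fmt;

// Hypothetical type, used only to illustrate the calling pattern.
struct Page {
    width: u32,
    height: u32,
}

impl fmt::Debug for Page {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each call describes one piece of structure; the builder
        // returned by `debug_struct` handles the actual formatting.
        f.debug_struct("Page")
            .field("width", &self.width)
            .field("height", &self.height)
            .finish()
    }
}
```

Formatting a value with `{:?}` then drives these calls. An implementation of this module's Serialize trait plays the same role, with the serializer objects standing in for the Formatter and its builders.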

Two-pass approach

Various fields in the DjVu transfer format describe the size/length or offset of some part of the document. Let’s call these “length fields”. A core goal of this module is to take complete responsibility for computing the correct values for such fields, since these computations are tightly coupled to the gritty details of the transfer format.

Length fields present some obstacles to an elegant serialization API. Consider the DIRM chunk, which appears at the beginning of a multi-page document and contains some metadata about the components, including their offsets and sizes. The offsets appear in the “plain” (uncompressed) portion of the DIRM chunk, but the sizes are BZZ-compressed. This makes it impossible to emit a well-formed DIRM chunk until we know the entire structure of the document in detail, so that we can compute the size of each component, compress that data, and then compute the final offset of each component. (Because the DIRM chunk appears before the components, and because the size of the BZZ-compressed portion can’t be determined without compressing the exact data in question, we really have to do things in this order.)

The only solution, if we care about hiding the intricacies of the transfer format from downstream, is to split serialization into two passes. On the first pass, we don’t emit any bytes, but only collect enough information to compute the value of every length field. On the second pass, we use that stored information to emit the bytes of the document, in order, with every length field set to its correct value.

The trick that this module pulls off is to encapsulate the two-pass nature of serialization. You just implement Serialize, and a function like to_vec takes care of calling Serialize::serialize twice internally, with two different Serializers: one for the first pass and one for the second.
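
The idea can be sketched with a toy format. Everything below is hypothetical: ToySerializer, ToySerialize, toy_to_vec and Doc are names invented for this illustration and are not this module's API. The toy layout (a count byte, one big-endian u32 offset per component, then the component bytes) is also invented and is far simpler than a real DIRM chunk, which additionally involves BZZ compression. The shape is what matters: the implementor writes one serialize method, and the driver runs it twice.

```rust
// Hypothetical toy format, NOT the DjVu transfer format:
//   [u8 count][u32 BE offset of each component][component bytes...]

enum Pass<'a> {
    // First pass: emit nothing, just record each component's size.
    Measure { sizes: &'a mut Vec<u32> },
    // Second pass: append the component bytes to the output.
    Emit { out: &'a mut Vec<u8> },
}

pub struct ToySerializer<'a> {
    pass: Pass<'a>,
}

impl ToySerializer<'_> {
    // The only method implementors call; it behaves differently per pass.
    pub fn component(&mut self, data: &[u8]) {
        match &mut self.pass {
            Pass::Measure { sizes } => sizes.push(data.len() as u32),
            Pass::Emit { out } => out.extend_from_slice(data),
        }
    }
}

pub trait ToySerialize {
    fn serialize(&self, ser: &mut ToySerializer<'_>);
}

pub fn toy_to_vec<T: ToySerialize + ?Sized>(doc: &T) -> Vec<u8> {
    // Pass 1: collect sizes only.
    let mut sizes = Vec::new();
    doc.serialize(&mut ToySerializer {
        pass: Pass::Measure { sizes: &mut sizes },
    });

    // The header can be written now that every size is known.
    let header_len = 1 + 4 * sizes.len() as u32;
    let mut out = Vec::new();
    out.push(sizes.len() as u8);
    let mut offset = header_len;
    for size in &sizes {
        out.extend_from_slice(&offset.to_be_bytes());
        offset += size;
    }

    // Pass 2: the same `serialize` call, now emitting bytes.
    doc.serialize(&mut ToySerializer {
        pass: Pass::Emit { out: &mut out },
    });
    out
}

// A made-up document type with a single-pass-looking implementation.
pub struct Doc {
    pub pages: Vec<Vec<u8>>,
}

impl ToySerialize for Doc {
    fn serialize(&self, ser: &mut ToySerializer<'_>) {
        for page in &self.pages {
            ser.component(page);
        }
    }
}
```

Note that this only works if serialize makes the same sequence of calls on both passes. In the sketch that follows from Doc being borrowed immutably; it says nothing about how the real module enforces or checks the same property.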

Structs

AnnotBuf
Builder for the (uncompressed) data in an ANTa or ANTz chunk.
BookmarkBuf
Builder for the (uncompressed) data in a NAVM chunk.
Error
An error encountered during serialization.
Okay
Opaque token returned from successful serialization.
SerializeBg44Chunks
Serializer for a sequence of BG44 chunks within a DJVI or DJVU component.
SerializeComponents
Serializer for the components of a multi-page document.
SerializeElements
Serializer for the elements of a DJVU or DJVI component.
SerializeMultiPageHead
Serializer for the “head” data of a multi-page document (DIRM and NAVM chunks).
SerializeThumbnails
Serializer for the chunks of a THUM component.
Serializer
The starting point for serialization.
TxtBuf
Builder for the (uncompressed) data in a TXTz chunk.
ZoneBuf
Builder for the “zones” data in a TXTa chunk.

Enums

SerializeMultiPageBundled
Serializer for a bundled multi-page document.

Traits

Serialize
Interface for describing the structure of a DjVu document.

Functions

to_vec
Serialize a document into a buffer of bytes.
to_writer (std)
Serialize a document into a provided writer.