Module sndjvu_format::ser

Serialization to the DjVu transfer format.

If you have a Rust data type that represents a DjVu document, implementing this module’s Serialize trait allows you to turn a value of that type into a blob of bytes in the transfer format. The structure of the document is expressed by a sequence of method calls on various “serializer” objects, such as Serializer, SerializeMultiPageBundled, etc. This approach should be familiar from APIs like std::fmt::Debug and serde::Serialize.
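For a concrete picture of that pattern, here is the standard-library analogy mentioned above (this is plain std::fmt, not this crate's API): a hand-written Debug implementation describes the value's structure through method calls on a Formatter, and the formatter decides what text is actually produced. Implementing this module's Serialize trait has the same shape, with the serializer objects standing in for the Formatter.

```rust
use std::fmt;

struct Page {
    width: u32,
    height: u32,
}

// The value's structure is described through method calls on the formatter;
// the formatter decides what output those calls actually produce.
impl fmt::Debug for Page {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Page")
            .field("width", &self.width)
            .field("height", &self.height)
            .finish()
    }
}

fn main() {
    let p = Page { width: 2550, height: 3300 };
    println!("{p:?}"); // Page { width: 2550, height: 3300 }
}
```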

Two-pass approach

Various fields in the DjVu transfer format describe the size/length or offset of some part of the document. Let’s call these “length fields”. A core goal of this module is to take complete responsibility for computing the correct values for such fields, since these computations are tightly coupled to the gritty details of the transfer format.

Length fields present some obstacles to an elegant serialization API. Consider the DIRM chunk, which appears at the beginning of a multi-page document and contains some metadata about the components, including their offsets and sizes. The offsets appear in the “plain” (uncompressed) portion of the DIRM chunk, but the sizes are BZZ-compressed. This makes it impossible to emit a well-formed DIRM chunk until we know the entire structure of the document in detail, so that we can compute the size of each component, compress that data, and then compute the final offset of each component. (Because the DIRM chunk appears before the components, and because the size of the BZZ-compressed portion can’t be determined without compressing the exact data in question, we really have to do things in this order.)

The only solution, if we care about hiding the intricacies of the transfer format from downstream, is to split serialization into two passes. On the first pass, we don’t emit any bytes, but only collect enough information to compute the value of every length field. On the second pass, we use that stored information to emit the bytes of the document in order, filling in the correct value for each length field.

The trick that this module pulls off is to encapsulate the two-pass nature of serialization. You just implement Serialize, and a function like to_vec takes care internally of calling Serialize::serialize twice, with two different Serializers, one for the first pass and one for the second.
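To make the two passes concrete, here is a minimal, self-contained sketch of the technique. Every name in it is hypothetical and unrelated to the actual items in this module; it only illustrates how the same structural description can be run once against a measuring sink and once against an emitting sink, so that a length field that precedes the data it describes can be filled in correctly.

```rust
// Minimal illustration of two-pass serialization. All names here are
// hypothetical and unrelated to the real items in sndjvu_format::ser.
enum Pass {
    /// First pass: count bytes, emit nothing.
    Measure { total: usize },
    /// Second pass: emit bytes, using lengths learned during `Measure`.
    Emit { out: Vec<u8> },
}

impl Pass {
    fn put(&mut self, bytes: &[u8]) {
        match self {
            Pass::Measure { total } => *total += bytes.len(),
            Pass::Emit { out } => out.extend_from_slice(bytes),
        }
    }
}

/// Describe the body of a chunk. The same description runs in both passes.
fn describe_body(pass: &mut Pass, payload: &[u8]) {
    pass.put(b"v1");   // some header bytes inside the body
    pass.put(payload); // the payload itself
}

/// A chunk is a 4-byte identifier, a 4-byte big-endian length field, and a
/// body. The length field precedes the body, so its value must be known
/// before any body bytes are written.
fn chunk_to_vec(payload: &[u8]) -> Vec<u8> {
    // Pass 1: run the description against a measuring sink to learn the
    // body's length without emitting anything.
    let mut measure = Pass::Measure { total: 0 };
    describe_body(&mut measure, payload);
    let body_len = match measure {
        Pass::Measure { total } => total as u32,
        Pass::Emit { .. } => unreachable!(),
    };

    // Pass 2: emit the bytes, filling in the length field computed above.
    let mut emit = Pass::Emit { out: Vec::new() };
    emit.put(b"CHNK");
    emit.put(&body_len.to_be_bytes());
    describe_body(&mut emit, payload);
    match emit {
        Pass::Emit { out } => out,
        Pass::Measure { .. } => unreachable!(),
    }
}

fn main() {
    let bytes = chunk_to_vec(b"hello");
    assert_eq!(&bytes[..4], b"CHNK");
    assert_eq!(&bytes[4..8], &7u32.to_be_bytes()[..]); // "v1" + "hello" = 7 bytes
    assert_eq!(&bytes[8..], b"v1hello");
}
```

In this module, a function like to_vec plays the role of chunk_to_vec above: it drives Serialize::serialize once per pass, so downstream code only ever writes the structural description.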

Structs

  • Builder for the (uncompressed) data in an ANTa or ANTz chunk.
  • Builder for the (uncompressed) data in a NAVM chunk.
  • An error encountered during serialization.
  • Opaque token returned from successful serialization.
  • Serializer for a sequence of BG44 chunks within a DJVI or DJVU component.
  • Serializer for the components of a multi-page document.
  • Serializer for the elements of a DJVU or DJVI component.
  • Serializer for the “head” data of a multi-page document (DIRM and NAVM chunks).
  • Serializer for the chunks of a THUM component.
  • The starting point for serialization.
  • Builder for the (uncompressed) data in a TXTz chunk.
  • Builder for the “zones” data in a TXTa chunk.

Enums

Traits

  • Interface for describing the structure of a DjVu document.

Functions

  • Serialize a document into a buffer of bytes.
  • Serialize a document into a provided writer.