MEP	4
Title	Remove Google ProtoBuf Encoding From The Protocol
Status	Accepted
Created	2014-11-18
Author	Otto Allmendinger

Remove Google ProtoBuf Encoding From The Protocol

Abstract

In general, all messages sent between Client and Notary should be human-readable. This helps developers understand and debug message content, and is necessary in cases where messages or message content needs to be audited by third parties in case of a dispute.

The protocol should also be as simple as possible and use as few encoding schemes as necessary.

This MEP suggests the removal of the encoding defined by the Google ProtoBuf library in order to increase human readability and simplify the protocol.

Current status

Open-Transactions uses four different encodings:

"Section Format". Used at the top level of any signed document. This is OpenPGP-inspired format that separates signed content from signatures using dashed line markers: -----BEGIN SIGNED CONTRACT-----.
XML. The document payload is usually encoded using XML. With MEP-3, it will be valid XML as well.
ASCII-Armoring. Used for binary data and lossless embedding of sub-documents. This encoding uses three steps.
Pack byte sequence using Protocol Buffers
Use zlib DEFLATE algorithm (strings only)
Encode byte sequence using Base64
Protocol-Buffers. Used for encoding structured data. Certain data types like key-value pairs and market data are encoded into byte sequences using Protocol Buffers. (The resulting byte sequence is then encoded using ASCII-Armoring as well, where it will be ProtoBuf-encoded as well)

Strictly speaking, the use of ProtoBuf is dependent on a compile-time flag that allows using another "packer" instead of ProtoBuf. This is not used in practice. Allowing for different packers would make the description of the protocol even more difficult.

(Side-note: The use of ASCII-Armoring for embedding plain-text data eliminates human readability and dramatically increases file size after compression. Due to the difficulty of changing that in the current code, it will be dealt with in a future MEP)

Drawbacks

The Google ProtoBuf library is a dependency for any component that wants to read or write messages or documents in Open-Transactions. Using the library requires the definitions of the particular data types that are encoded and decode. This makes reading and writing protocol messages significantly more complex.

ProtoBuf encodes data types into byte sequences, which are not human-readable and must be radix-encoded when embedded inside an XML document.

ProtoBuf in ASCII-Armoring

Here, Protocol Buffers are used to encode a byte sequence into another byte sequence. This is essentially a NOOP that serves no discernible purpose.

ProtoBuf for Structured Data

Only Section-Format and XML can be described as human-readable. In the case of binary data (hash values and signature data), radix encoding cannot be avoided and improve human readability, like base64 encoding of signatures and base58 encoding of hash values. In all other cases, radix encoding decreases human readability. By encoding structured data using ProtoBuf instead of XML, armoring must be used and human readability is lost.

Enhancement

The removal of the ProtoBuf from protocol simplifies the complexity of the protocol and improves human readability in some cases.

Remove ProtoBuf from ASCII-Armoring

This simplifies the decoding and encoding algorithms for "ASCII-Armored" data.

Replace ProtoBuf with XML for Structured Data

This increases human readability of protocol messages and simplifies encoding and decoding of document types.

Impact

Removal of the packing stage in the class ASCIIArmor. This is a simple change that only requires sample data to be recreated.
Replacement of encoding of complex data types (e.g. Key-Value pairs, Marked Data) with XML document types. This requires some design effort, changes in the writing/parsing code and recreation of sample data.