Meri Leeworthy

Beelay

Practical Rateless Set Reconciliation

Has access to list of members for each document but not the content

We try to hide the links between documents but we show the relationship between compressed chunks

we split the document into as few compressed chunks as possible

https://github.com/automerge/beelay/blob/main/docs/protocol.md :

Beelay - A new sync protocol for Automerge

This document describes a new sync protocol - tentatively named "Beelay" - designed for efficiently synchronizing collections of Automerge documents.

Introduction and motivation

An Automerge document can be thought of as like git for JSON. Each document is a data structure which can be materialized as a JSON object, but modifications to the document are stored within the document as a commit DAG rather like Git. This means that Automerge documents can be edited concurrently and merged together automatically later, providing a substrate for collaboration without central peers.

One common requirement for collaborative software is real-time communication between concurrently editing peers. For this purpose Automerge includes a sync protocol which enables peers to send deltas rather than sending the entire document on every change - for things like per-keystroke editing this is essential. This sync protocol works but production use has exposed a few limitations: firstly, that running sync servers is expensive and compromises security and secondly, that applications frequently want to synchronize many documents.

Sync Servers

Due to the structure of networks on the internet, and due to requirements for data availability, it is often necessary to have a central server which acts as a relay and store-and-forward peer for sync messages. Unlike peers running on behalf of a single user these peers must synchronize many documents at once. The existing sync protocol requires that the entire document be in memory in order to run the sync protocol which limits the scale these servers can affordably reach.

In addition, one of the appealing things about Automerge is that you don't need to trust a central server. Introducing a sync server somewhat compromises this feature - you must trust the server operator not to look at your data and to keep it secure.

Collections of Documents

An Automerge document is a "unit of sharing". You can't share subcomponents of a document with other people - or view the history of just one part of the document. This is a useful property because it makes the behavior of a document predictable in the face of concurrent changes but it is also a limitation because many useful applications involve sharing subsets of a users data with different people.

To work around this limitation applications often create many Automerge documents and link them together via a location independent url. This introduces challenges for the sync protocol. We have users who have thousands of documents - synchronizing just the small number of documents which have changed in one session is very inefficient with the current scheme - which must load every document into memory and create a separate sync session for each one.

Requirements

These two sets of problems are technically unrelated, what brings them together for our purposes is that they can both be solved by a new sync protocol. Specifically what we require is a sync protocol which

  • Does not impose O(n) memory requirements on sync servers where n is the number of documents being synchronized

  • Allows for sync servers to operate over encrypted data to reduce the trust users have to place in sync servers

  • Provides a mechanism for efficiently determining what documents in a collection of documents have changed

Design

Beelay is an RPC based protocol which operates between two peers. We make no assumptions about the ordering of messages on the channel connecting these two peers. In typical interactions one peer will be a "sync server" and one will be a user agent within some application, although the protocol doesn't require this topology.

To begin synchronizing a document with a remote peer the local Beelay peer first requests that the remote peer create a "snapshot" representing the current state of the given document and every document transitively reachable (via links) from it. The remote peer creates this snapshot and returns an identifier the local peer can use to perform [RIBLT-sync] with the snapshot. Once this sync is complete the local peer knows which documents in the collection are out of sync.

At this point the local peer runs sedimentree sync for each out of sync document. Once this is complete the local peer knows that it is at least up to date with the state of the collection at the time of the snapshot. Finally the local peer can [listen] to any changes to documents in the snapshot which have occurred since the snapshot was created. Thus the local peer can stay in sync with live updates.

It must be possible to run this synchronization protocol over encrypted data which means that we cannot directly examine the contents of the documents being synchronzed in order to extract the links between them. Instead, peers which have the clear text of the documents synchronize the links between documents to a "reachability index" on the server. This index is a very simple CRDT which is also synchronized via [sedimentree sync].

RIBLT Sync

RIBLT sync refers to "Rateless Invertible Bloom Lookup Tables" as presented in Practical Rateless Set Reconciliation. This scheme allows for a set of items to be reconciled between two peers with a bandwidth overhead which is proportional to the size of the set difference and with a very low number of round trips.

Sedimentree Sync

Sedimentree sync is a scheme we have designed for synchronizing commit DAGs such as those which make up an Automerge document. The important features of sedimentree sync are that it allows for compressing runs of operations in the commit DAG and ommitting their change hashes from the compressed runs. This is crucial as it allows us to keep a very granular commit history without using enormous amounts of storage and/or bandwidth. See [sedimentree.md] for more details.

Messages

Beelay messages are encoded in a binary format which we describe here. With the following additional notation:


byte = %x00-FF

        ; any octet

bytes = 1*byte

        ; any sequence of octets

uleb128 = 1*8( %x00-7F ) / ( %x80-FF bytes )

        ; unsigned LEB128 encoding

leb128 = 1*8( %x00-7F ) / ( %x80-FF bytes )

        ; signed LEB128 encoding

With the exception of notifications sent in response to listen requests, every message is either a request or a response.


message = message_type ((request_id (request / response)) /  notification)

message_type = %d00 ; request

    / %d01 ; response

    / %d02 ; notification

request_id = 16(byte)



request = create_snapshot_request

    / snapshot_symbols_request

    / fetch_sedimentree_request

    / fetch_blob_part_request

    / upload_commits_request

    / upload_blob_request

    / listen

response = create_snapshot

    / snapshot_symbols_response

    / fetch_sedimentree_response

    / fetch_blob_part_response

    / upload_commits_response

    / upload_blob_response

    / listen

Create Snapshot

The create snapshot request contains just the document ID of the root document to create a snapshot from.


create_snapshot_request = %d04 ; request type

    uleb128 ; length of root document ID

    bytes ; root document ID

The response to a create snapshot request contains the snapshot ID and the first "coded symbols" from the RIBLT sync for the snapshot in question. The receiver should attempt to peel these symbols and if they find they still need more, then use the snapshot_symbols request to request more symbols.


create_snapshot_response = %d04 ; response type

    16(byte) ; snapshot ID

    uleb128 ; number of coded symbols

    *coded_symbol



coded_symbol = 16(byte) ; first part of symbol which is a document ID when peeled

    32(byte) ; second part of symbol, which is the hash of the heads of the 

             ; document when peeled

Snapshot Symbols

Used to request additional RIBLT coded symbols for an existing snapshot.


snapshot_symbols_request = %d05 ; request type

    16(byte) ; snapshot ID



snapshot_symbols_response = %d05 ; response type

    uleb128 ; number of coded symbols

    *coded_symbol

Fetch Sedimentree

Request to fetch the Sedimentree (both content and reachability index) for a specific document.


fetch_sedimentree_request = %d01 ; request type

    uleb128 ; length of document ID

    bytes ; document ID



fetch_sedimentree_response = %d02 ; response type

    sedimentree_data



sedimentree_data = %d00 ; not found

    / (%d01 ; found

       content_summary

       index_summary)



content_summary = uleb128 ; number of bundles

    *sedimentree_bundle



index_summary = uleb128 ; number of bundles  

    *sedimentree_bundle



sedimentree_bundle = bytes ; bundle data

Fetch Blob Part

Request to fetch a portion of a blob by its hash.


fetch_blob_part_request = %d02 ; request type

    32(byte) ; blob hash

    uleb128 ; offset

    uleb128 ; length



fetch_blob_part_response = %d03 ; response type

    uleb128 ; length of blob part

    bytes ; blob part data

Upload Commits

Request to upload commits for a document.


upload_commits_request = %d00 ; request type

    uleb128 ; length of document ID

    bytes ; document ID

    commit_category ; single byte indicating commit category

    uleb128 ; number of upload items

    *upload_item



upload_commits_response = %d01 ; response type

    ; empty response indicating success



upload_item = blob_ref tree_part



blob_ref = %d00 32(byte) ; blob hash

    / (%d01 ; inline blob

       uleb128 ; blob length

       bytes) ; blob data



tree_part = %d00 ; stratum

    (%d00 / (%d01 32(byte))) ; optional start commit hash

    32(byte) ; end commit hash

    uleb128 ; number of checkpoints

    *32(byte) ; checkpoint hashes

    / (%d01 ; commit

       32(byte) ; commit hash

       uleb128 ; number of parents

       *32(byte)) ; parent hashes

Upload Blob

Request to upload a complete blob.


upload_blob_request = %d03 ; request type

    uleb128 ; blob length

    bytes ; blob data



upload_blob_response = %d01 ; response type

    ; empty response indicating success

Listen

Request to listen for updates to a snapshot. Any changes to the documents reachable from the snapshot which have been discovered by the server since the snapshot was created will be delivered as "notification" messages, which are the only kind of message sent without an explicit request.


listen_request = %d06 ; request type

    16(byte) ; snapshot ID



listen_response = %d06 ; response type

    ; empty response indicating successful subscription



notification = peer_id

    doc_id

    upload_item



peer_id = uleb128 ; length of peer ID

    bytes ; peer ID bytes



doc_id = uleb128 ; length of document ID

    bytes ; document ID bytes

Notifications

Notifications are sent to a peer when a change is discovered to a document which is transitively reachable from a snapshot the peer issued a listen request to.


notification = from_peer

    document_id

    upload_item



from_peer = uleb128 ; length of peer ID bytes

    bytes ; peer ID



document_id = uleb128 ; length of document ID bytes

    bytes ; document ID



upload_item = blob_ref tree_part



blob_ref = %d00 32(byte) ; blob hash reference

    / (%d01 ; inline blob

       uleb128 ; blob length

       bytes) ; blob data



tree_part = %d00 ; stratum

    (%d00 / (%d01 32(byte))) ; optional start commit hash

    32(byte) ; end commit hash

    uleb128 ; number of checkpoints

    *32(byte) ; checkpoint hashes

    / (%d01 ; commit

       32(byte) ; commit hash

       uleb128 ; number of parents

       *32(byte)) ; parent hashes

Error Response

Any request can result in an error response:


error_response = %d00 ; error response type

    uleb128 ; length of error message

    bytes ; UTF-8 encoded error message

Backlinks

I live and work on the land of the Wurundjeri people of the Kulin Nation. I pay respect to their elders past and present and acknowledge that sovereignty was never ceded. Always was, always will be Aboriginal land.

This site uses open source typefaces, including Sligoil by Ariel Martín Pérez, and Vercetti by Filippos Fragkogiannis