Documents

A document is a high-level, mutable, multi-writer data type constructed from a linked graph of operations.
Through a deterministic process the graph can be reduced to a single key-value map.
Any two documents (replicas) which contain the same collection of operations will resolve to the same value.
A document is identified by the operation id of its root CREATE operation (aka document_id).
A document assumes the schema of its root CREATE operation.
A document is made up of operations published by one or many authors.
- Branches in a document's graph occur when two authors publish operations concurrently.
Every operation has a previous field containing a document_view_id which refers to the document state at the moment the operation was encoded.
- These previous references make up the edges in a graph, the operations being the nodes.
- The graph describes the causal relationship between all operations in a document.
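
To make the graph structure concrete, here is a minimal Python sketch. The `Operation` class, the short ids and the `build_edges` helper are illustrative assumptions, not part of p2panda; a document_view_id is modelled here simply as a tuple of operation ids.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    id: str                # hypothetical short id, stands in for a real operation id
    action: str            # "CREATE", "UPDATE" or "DELETE"
    previous: tuple = ()   # operation ids named in the previous field


def build_edges(operations):
    """Map each operation id to the ids of the operations that refer to it."""
    edges = defaultdict(list)
    for op in operations:
        for prev_id in op.previous:
            edges[prev_id].append(op.id)
    return dict(edges)


# Two authors update concurrently after the CREATE (a branch),
# then a later operation refers to both branch tips.
ops = [
    Operation("a1", "CREATE"),
    Operation("b2", "UPDATE", ("a1",)),
    Operation("c3", "UPDATE", ("a1",)),
    Operation("d4", "UPDATE", ("b2", "c3")),
]
edges = build_edges(ops)

# An operation referred to by more than one other operation is a branch point.
branch_points = [op_id for op_id, next_ids in edges.items() if len(next_ids) > 1]
```

In this example `a1` is the only branch point, because both `b2` and `c3` name it in their previous field.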

🐻‍❄️Fun fact

Some things that may be a document in p2panda: a blog post, a wiki page, a chat message, a user account, a configuration setting, a game board.

🌩️Requirement DO1

A document MUST contain exactly one CREATE operation.

🌩️Requirement DO2

A document's operation graph MUST NOT contain any cycles.

🌩️Requirement DO3

A document MUST NOT contain an operation whose previous refers to an operation not present in the document's graph.

🌩️Requirement DO4

A document MAY contain any number of DELETE operations. Documents which contain one or more DELETE operations no longer produce a materialised view.

🌩️Requirement DO5

A document's operations MUST be encoded on entries which are published to a single log for each contributing author/public key.
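
Requirements DO1 to DO3 describe structural properties that can be checked mechanically. The following Python sketch shows one possible way to do so; the `Operation` class and the `validate` function are hypothetical names introduced for illustration, and logs (DO5) are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    id: str                # hypothetical short id
    action: str            # "CREATE", "UPDATE" or "DELETE"
    previous: tuple = ()   # operation ids named in the previous field


def validate(operations):
    """Return the labels of violated requirements (empty if DO1-DO3 hold)."""
    by_id = {op.id: op for op in operations}
    violations = []

    # DO1: exactly one CREATE operation.
    if sum(op.action == "CREATE" for op in operations) != 1:
        violations.append("DO1")

    # DO2: no cycles. Repeatedly remove operations whose known predecessors
    # have all been removed; anything left over sits on a cycle or after one.
    pending = {op.id: {p for p in op.previous if p in by_id} for op in operations}
    ready = [op_id for op_id, prev in pending.items() if not prev]
    removed = 0
    while ready:
        current = ready.pop()
        removed += 1
        for op_id, prev in pending.items():
            if current in prev:
                prev.remove(current)
                if not prev:
                    ready.append(op_id)
    if removed != len(pending):
        violations.append("DO2")

    # DO3: every previous reference must point at an operation in the graph.
    if any(p not in by_id for op in operations for p in op.previous):
        violations.append("DO3")

    return violations


valid = [Operation("a1", "CREATE"), Operation("b2", "UPDATE", ("a1",))]
dangling = [Operation("a1", "CREATE"), Operation("b2", "UPDATE", ("zz",))]
```

Here `validate(valid)` returns an empty list, while `validate(dangling)` reports `DO3` because `zz` is not part of the graph.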

Viewing a document

When viewing a document, its state must be reduced to a single key-value map. This process involves two steps:

🐼Definition: Materialisation

Although we describe resolving an operation graph here as a property of the document data type, it can also be seen as the process of materialisation. This term is borrowed from database terminology, where views on data can be materialised into virtual tables. It is a useful concept in p2panda and one that is used often.

1. Reconciliation

The first step we take is to sort and linearise the document's graph of operations deterministically.
We do this by applying a topological depth-first sorting algorithm which meets the following requirements:

🌩️Requirement DO6

Sorting MUST start from the document's CREATE operation.

🌩️Requirement DO7

An operation which refers to the current operation in its previous field MUST be sorted next.

🌩️Requirement DO8

If multiple operations refer to the current operation, the one with the lowest operation id MUST be sorted next.

🌩️Requirement DO9

When visiting a branch, all operations it contains MUST be visited and sorted before continuing to the rest of the graph.

🌩️Requirement DO10

All operations in the graph MUST be sorted exactly once.

🌩️Requirement DO11

If any DELETE operation is visited, materialisation of the document MUST stop immediately. The resulting document view id MUST include only the id of the DELETE operation and a document view SHOULD NOT be produced.
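
The sorting requirements above can be sketched as an iterative depth-first walk. This is only one possible reading, not the reference implementation: the `Operation` class and `reconcile` function are hypothetical, ids are short strings compared lexicographically, the graph is assumed to already satisfy DO1 to DO3, and an operation that refers to several branches is assumed to wait until all of them have been sorted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    id: str                # hypothetical short id, compared as a string
    action: str            # "CREATE", "UPDATE" or "DELETE"
    previous: tuple = ()   # operation ids named in the previous field


def reconcile(operations):
    """Return operation ids in a deterministic, causally consistent order."""
    by_id = {op.id: op for op in operations}
    children = {op.id: [] for op in operations}
    for op in operations:
        for prev_id in op.previous:
            children[prev_id].append(op.id)

    create = next(op for op in operations if op.action == "CREATE")
    ordered, done = [], set()
    stack = [create.id]  # sorting starts from the CREATE operation
    while stack:
        current = by_id[stack.pop()]
        if current.id in done:
            continue  # every operation is sorted exactly once
        if any(prev_id not in done for prev_id in current.previous):
            continue  # assumption: revisit once the other branches are sorted
        ordered.append(current.id)
        done.add(current.id)
        if current.action == "DELETE":
            break  # stop immediately when a DELETE is visited
        # Push in descending order so the lowest id is popped first; the
        # stack finishes one branch before moving on to its siblings.
        stack.extend(sorted(children[current.id], reverse=True))
    return ordered


ops = [
    Operation("a1", "CREATE"),
    Operation("b2", "UPDATE", ("a1",)),
    Operation("c3", "UPDATE", ("a1",)),
    Operation("d4", "UPDATE", ("b2",)),
    Operation("e5", "UPDATE", ("c3", "d4")),
]
order = reconcile(ops)  # ['a1', 'b2', 'd4', 'c3', 'e5']
```

The branch starting at `b2` is sorted completely (`b2`, `d4`) before `c3`, and the result does not depend on the order in which the operations were received.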

2. Reduction

The second and final step is to reduce the linearised list of operations into a single key-value map by applying the following rules:
1. Deserialise all fields of the document's CREATE operation to produce a document view.
2. If the next operation in the document is an UPDATE operation:
  - for every field in the operation
    - overwrite this field's contents on the view with the contents from the operation
3. If the next operation in the document is a DELETE operation:
  - remove the content on all fields of the view
  - mark the view as deleted
  - stop reduction here
4. Stop reduction if there is no next known operation in the document.
5. Otherwise, continue with step 2.
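
The reduction rules can be sketched as a simple fold over the sorted operations. The `Operation` class, the `reduce_operations` function and the example field names are assumptions made for illustration; the input is assumed to be already linearised, with the CREATE operation first.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    id: str                                      # hypothetical short id
    action: str                                  # "CREATE", "UPDATE" or "DELETE"
    fields: dict = field(default_factory=dict)   # already deserialised fields


def reduce_operations(ordered):
    """Fold a sorted list of operations into a (view, deleted) pair."""
    create, *rest = ordered
    view = dict(create.fields)        # rule 1: start from the CREATE fields
    for op in rest:                   # rules 4 and 5: loop while operations remain
        if op.action == "UPDATE":     # rule 2: overwrite each field it carries
            view.update(op.fields)
        elif op.action == "DELETE":   # rule 3: clear, mark deleted, stop
            view.clear()
            return view, True
    return view, False


ops = [
    Operation("a1", "CREATE", {"title": "Hello", "body": "draft"}),
    Operation("b2", "UPDATE", {"body": "final"}),
]
view, deleted = reduce_operations(ops)
```

After the fold, `view` holds `{"title": "Hello", "body": "final"}` and `deleted` is `False`; fields not mentioned by an UPDATE keep their earlier value.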
