swarm-peer

Swarm peer

Swarm is a sync-centric database implementing Commutative Replicated Data Types (CmRDTs). CmRDTs are op-based, hence Swarm is built on top of a partially ordered log of operations, very much like classic databases are built on top of totally ordered logs.

A peer is the part of Swarm that does "synchronization" per se. Peers disseminate and store the op log. They also serve it to clients, which implement the actual CRDT datatypes (Hosts, see swarm-syncable). A peer is mostly oblivious to data types and logic; its mission is to get every op delivered to all of an object's replicas, preferably once.

This Peer implementation keeps its data in a storage engine accessed through the LevelUP interface. Normally, that is LevelDB on the server, IndexedDB on the client (in case you'd like to download a full db into the browser).

  • LevelOp - op database implemented on top of LevelDOWN
  • {OpStream} LogOpStream - implements partially ordered log handling
  • {OpStream} PatchOpStream - manages patches
    • pass-through for mutations
    • accepts on/offs
    • reads db, produces patches
    • optionally, produces snapshots, writes them to db
    • emits mutations (p-t), patches (scoped), reciprocal ons/offs
  • {OpStream} SwitchOpStream - sends/receives ops to clients, manages subscription tables

OpStreams

A short clarification of the Swarm domain model and its terms: what are users, sessions, clocks, databases and subscriptions (ons)?

First of all, a user is an "end user" identified by a login (an alphanumeric string up to 32 characters long, underscores permitted, like gritzko). A user may have an arbitrary number of sessions (like apps on mobile devices, browser tabs, desktop applications). Sessions have unique identifiers too, like gritzko~1kz (a tilde, then a serial in Base64). Session ids may be recursive, like gritzko~1kz~2. Each session has a clock that produces a monotonic sequence of Lamport timestamps, or simply stamps, which consist of the local time value and the session id, like 2Ax1k+gritzko~1kz. Every op is timestamped at its originating process.
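
To make the identifier formats above concrete, here is a minimal sketch that splits a stamp and a session id into their parts and issues toy stamps. All names (parseStamp, parseSession, ToyClock) are hypothetical and not part of swarm-peer. The time encoding (fixed-width base 36) is an assumption chosen so that string order matches numeric order; it is not Swarm's Base64 format.

```javascript
// Hypothetical helpers for illustration only; not the swarm-peer API.

// Split a stamp like "2Ax1k+gritzko~1kz" into time value and session id.
function parseStamp(stamp) {
  const plus = stamp.indexOf('+');
  if (plus <= 0 || plus === stamp.length - 1) {
    throw new Error('malformed stamp: ' + stamp);
  }
  return { time: stamp.slice(0, plus), session: stamp.slice(plus + 1) };
}

// Split a session id like "gritzko~1kz~2" into the login and its serials.
function parseSession(session) {
  const [user, ...serials] = session.split('~');
  // Login rule from the text: alphanumeric, underscores, up to 32 chars.
  if (!/^[0-9A-Za-z_]{1,32}$/.test(user)) {
    throw new Error('malformed login: ' + user);
  }
  return { user, serials };
}

// Toy clock: stamps never repeat or go backwards within one session.
// Assumption: time is a zero-padded base-36 number, not Swarm's encoding.
class ToyClock {
  constructor(session, now = () => Date.now()) {
    this.session = session;
    this.now = now;
    this.last = 0;
  }
  // Issue the next stamp, bumping past the last value if wall time stalls.
  issue() {
    this.last = Math.max(this.now(), this.last + 1);
    return this.last.toString(36).padStart(10, '0') + '+' + this.session;
  }
  // Lamport rule: after seeing a (toy-format) stamp, never issue below it.
  see(stamp) {
    const remote = parseInt(parseStamp(stamp).time, 36);
    this.last = Math.max(this.last, remote);
  }
}
```

With these helpers, parseSession('gritzko~1kz~2') yields the login gritzko and the serials 1kz and 2, which is one way to read the "recursive" session ids.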

Swarm's synchronized CRDT objects are packed into databases identified by alphanumeric strings. A session may use multiple databases, but as the relative order of operations in different databases does not matter, each db is subscribed to in a separate connection. The implementation may or may not guarantee to preserve the relative order of changes to different objects in the same database. The client's session is linked to a de-facto local process (and storage), so it is likely to be shared between dbs (same as clocks).

Per-object subscriptions are optional (that depends on a particular database). Similarly, access control policies are set at per-db granularity (sample policy: the object's owner can write, others can read).
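
The sample policy above (owner writes, others read) can be sketched as a per-database lookup. This is only an illustration: the request shape ({ user, objectOwner, action }), the policies map and the deny-by-default rule are assumptions, not something swarm-peer defines.

```javascript
// Hypothetical policy: the object's owner can write, anyone can read.
function ownerWritesOthersRead({ user, objectOwner, action }) {
  if (action === 'read') return true;
  if (action === 'write') return user === objectOwner;
  return false; // unknown actions are rejected
}

// Policies are attached per database, matching the per-db granularity.
// The database name "notes" is made up for this example.
const policies = new Map([['notes', ownerWritesOthersRead]]);

// Assumption: a database without a policy denies everything.
function isAllowed(db, request) {
  const policy = policies.get(db);
  return policy ? policy(request) : false;
}
```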

The most common interface in the system is an OpStream. That is a single-database op stream going from one session to another.
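
As a rough illustration of that idea, the sketch below models a stream bound to one database and two sessions. ToyOpStream, its methods and the op shape ({ db, stamp }) are invented for this example; the real OpStream interface may look quite different.

```javascript
// Hypothetical single-database op stream; not the swarm-peer OpStream class.
class ToyOpStream {
  constructor(db, fromSession, toSession) {
    this.db = db;             // the one database this stream carries
    this.from = fromSession;  // originating session id
    this.to = toSession;      // receiving session id
    this.listeners = [];
  }
  // Register a callback invoked for every op written to the stream.
  onOp(fn) {
    this.listeners.push(fn);
  }
  // Deliver an op to all listeners, in the order the ops are written.
  write(op) {
    if (op.db !== this.db) {
      throw new Error('op belongs to another database: ' + op.db);
    }
    for (const fn of this.listeners) fn(op);
  }
}
```

Since each db is subscribed to in a separate connection, a session using two databases would hold two such streams rather than multiplexing them.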

TODO

Rework (1.1)

Goals: manageable state snapshotting, general clean-up and simplification. Storage/network/subscriptions go to Replica entirely; Entry becomes passive, merges with EntryState.

Method: rewire refactoring. Stages:

  • send ~ done ~ save
  • read db by callbacks
  • O queue
  • I queue
  • Entry ~ EntryState

Full job list:

  • move subscribers, write to Replica (replica.appendNewOp(op))
    • subscribers
    • append new op
    • send
    • save
    • Q selective ack, error -- mailbox ? entry.error
  • replica.snapshotting[typeid] -> [stream_id], check on relaying new ops
  • replica to save meta, Entry to stay passive
    • replica.op_queue, backpressure
    • unite Entry/EntryState
    • replica.readTail(fn, end) -> fn(op)* fn(null)|end()
    • replica to maintain a common pending queue
  • descending state hooks (last snapshot size, tail size)
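
One possible reading of the readTail(fn, end) contract in the list above: fn is called once per op, and the end of the tail is signalled either by end() or, when no end callback is given, by fn(null). The sketch below follows that reading over an in-memory array; the ops parameter and the synchronous delivery are assumptions made for the example, since the planned method would read from the db.

```javascript
// Hypothetical in-memory version of the planned replica.readTail contract.
function readTail(ops, fn, end) {
  // fn(op)* : one call per op, in log order
  for (const op of ops) fn(op);
  // fn(null) | end() : exactly one end-of-tail signal
  if (typeof end === 'function') {
    end();
  } else {
    fn(null);
  }
}
```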

Install

npm i swarm-peer

Version

1.2.0

License

MIT

Collaborators

  • gritzko