The Wave by Rick Z..
An Evented I/O B-tree for Node.js.
A b‑tree is a data structure used by databases to store records organized in large pages on disk.
By concurrent I mean that multiple queries can make progress on a descent of the b‑tree. Multiple reads can all navigate the b‑tree simultaneously, of course. Multiple reads can also make progress in the presence of a write, so long as they are not reading a page that is being written. This is the equivalence to "threading" in other database engines, but evented for Node.js.
Strata is a database primitive, it is not supposed to be used a as a general purpose database by it's lonesome, but an interface to a b‑tree and it's concepts that you can use to create different types database strategies.
The interface to Strata is not an API, it is a programmer's interface to b‑tree concepts. It is easy to use, if you know how a b‑tree works, but please don't complain about encapsulation; it is not a database engine, it is a b‑tree structure and the details are supposed to be exposed.
The Strata b‑tree interface describes a b‑tree as a collection of actors, not a collection of objects. A b‑tree isn't all about "pages." It's about descending, navigating, appending, and balancing a tree. When you read the code, you're going to find these people-named classes who do things.
Finally, Strata is an ancient project of mine, that began before I really know how a Node.js library is supposed to look, so I used closure based objects, which is a way to go, but most noders use prototype based objects. That's what I'd do I was to do it all over again, or maybe not; because I like the way the code turned out.
I'm going to cut this whinging in the final
README.md. It's here to vent my
defensiveness and remind of who my audience is; people who are experimenting
with their own database structure for their own domain-specific database.
I use a control-flow library of my own creation called Cadence that I'm just crazy about.
The primary benefit of Cadence it asynchronous try/catch error handling. This has made it very easy to create deeply nested asynchronous operations, yet have errors propagate up to the caller, the context that comes from maintaining a stack and unwinding it on error.
Secondary benefit is that Cadence always uses a trampoline. This allows me to cache aggressively without having to worry about blowing the stack.
Install from NPM.
npm install b-tree
TK: Unique keys, but duplicate keys are super easy to fake.
You must create the b‑tree object first, specifying the size of the inner branch pages as a count of child pages, and the size of the leaf pages as a count of stored records.
var openOrCreate = cadencevar strata = directory leafSize: 1024 branchSize: 1024 ;asyncstratacreateasyncstratamutate
Properties to the constructor...
Constructs a new b-tree that stores its files in the directory provided by
location. It does not open or close the b‑tree.
new Strata() takes an optional options object as its second argument; the
following properties are accepted:
extractor: A function that extracts the key from the record.
comparator: A function that is used to compare keys.
leafSize: The maximum size in records of a leaf page before it is it split.
branchSize: The maximum size in child pages of a branch page before it is split.
checksum: A cryptographic algorithm to use as a hash, or a checksum function to validate each line in a leaf page, and the contents of a branch page.
Opens the b-tree.
Creates a new, empty b-tree. It will raise an exception if there is anything in the location directory.
You search and edit the b‑ separate from editing it.
With Strata you either create read-only iterator, or a read/write mutator. The mutator is a superset of the iterator so let's start there.
var hasKey = cadenceasyncstrataiteratorsought checkatLeaffound = cursorindex >= 0asynccursorunlockasyncreturn foundhasKeystrata 'c'if error throw errorif exists console.log'I found it.'
In the above, we create a read-only
Cursor using the
function. That returns an iterator that holds a shared lock on the leaf page
that either contains the records for the given key, or else would contain the
record for the given key if it existed in the leaf page. The
Cursor says that
the record is here, or it should go here.
Cursor.index property is zero or more, it is the index of the record in
the leaf page. If the
Cursor.index property is less than zero, then it's
compliment is the index of where the record should go in the leaf page.
hasKey function above we simply return whether or not the record exists
based on the cursor index.
Ed: The following is stupid wrong and stupid.
var range = cadencestrataiteratorstart checkatLeaf;fetchcursorindex < 0 ? ~cursorindex : cursorindex;if index < cursorlengthcursorgetindex checkpush;elsecursornextcheckadvanced;if record < stopfoundpushrecord;fetchindex + 1;elsedone;if success done;else fetch0;cursorunlock;callbacknull found;rangestrata 'c' 'i'if error throw error;console.logfound;;