bitfield-db

1.2.3 • Public • Published

bitfield-db

bitfield database with rank+select over persistent storage

based on roaring+run and fenwick trees with 0 or 1 based numbering as in the louds paper

You might want to use a database like this when you have data like an inbox and you'd like to sort messages into different categories like "read", "unread", "spam", and "archive" and also where you might want to calculate totals for each category (rank) and skip to the next or previous message in each category (implemented internally using select+rank).

Pure javascript implementation and works using the random-access suite of adaptors. To use this module in the browser, you can use random-access-web.

example

var Bitfield = require('bitfield-db')
var raf = require('random-access-file')
var bf = new Bitfield(raf('/tmp/bitfield.db'))
 
;[5,10,15,20,23,24,26,27].forEach(x => bf.add(x))
 
bf.flush(function (err) {
  if (err) return console.error(err)
  bf.rank(24, function (err, res) {
    if (err) console.error(err)
    else console.log(`rank(24) = ${res}`)
  })
  bf.select(5, function (err, res) {
    if (err) console.error(err)
    else console.log(`select(5) = ${res}`)
  })
  bf.predecessor(24, function (err, res) {
    if (err) console.error(err)
    else console.log(`predecessor(24) = ${res}`)
  })
  bf.successor(24, function (err, res) {
    if (err) console.error(err)
    else console.log(`successor(24) = ${res}`)
  })
})

output:

rank(24) = 5
select(5) = 24
successor(24) = 26
predecessor(24) = 23

one vs zero based numbering

This library supports both one-based and zero-based numbering schemes.

All the methods with a trailing 1 in their name use one-based numbering.

All the methods with a trailing 0 in their name use zero-based numbering.

One-based numbering is the default. Under this scheme, rank1(x) counts the number of elements contained in the set < x. When an element is in the set, it is a "one" because the entry for its bitfield (or array or run) is "set" as a 1.

The methods rank(), select(), predecessor(), and successor() are aliases for the corresponding methods: rank1(), select1(), predecessor1(), and successor1().

Under zero-based numbering, rank0(x) counts the number of elements (non-negative integers) NOT contined in the set which are < x. When an element is not contained in the set, it is a "zero" because the entry for its bitfield (or array or run) is "unset" as a 0.

api

var Bitfield = require('bitfield-db')
var MultiBitfield = require('bitfield-db/multi')

var bf = new Bitfield(storage, opts)

Create a new bitfield-db instance bf from a random-access store instance and optionally:

  • opts.bits - maximum size of numbers in the set (2**bits). default: 32
  • opts.prefix - string prefix to prepend to all keys stored in the database
  • opts.db - tinybox-compatible instance to store data. Otherwise a database is created automatically.

Due to limitation of javascript's builtin number type, setting bits more than 50 may cause errors. Operations will be faster and take less storage with a smaller bit size.

If you use opts.prefix, make sure to properly delimit your records by passing a trailing delimiter and disallowing the delimiter user input.

bf.add(x)

Add x to the set in the current batch.

Only positive integers including zero up to the precision set as opts.bits are allowed for x.

You will need to bf.flush(cb) the database before queries will reflect this action.

bf.delete(x)

Delete x from the set in the current batch.

Only positive integers including zero up to the precision set as opts.bits are allowed for x.

You will need to bf.flush(cb) the database before queries will reflect this action.

bf.flush(opts={}, cb)

Commit the current batch to the database. cb(err) is called when the operation completes.

When opts.sync === false, only queue the underlying database writes, do not flush the underlying database. Use this option if you need fine-grained control over flushing, for example if you want to store multiple bitfields or other pieces of data on the same database.

bf.has(x, cb)

Test whether x is in the set. cb(err, exists) is called with a boolean exists or an error err.

bf.rank(x, cb)

Get the number of elements i in the set that are less than x as cb(err, i).

This corresponds to searching for the number of 1s < x in the bitfield, so the alias is called rank1.

aliases:

  • bf.rank1(x, cb)

bf.rank0(x, cb)

Get the number of non-negative integer elements i NOT in the set that are less than x as cb(err, i).

This corresponds to searching for the number of 0s < x in the bitfield, so this method is called rank0.

bf.select(i, cb)

Find a value of x where rank(x) = i as cb(err, x).

If there is no element x in the set where rank(x) = i, you will receive the value -1.

aliases:

  • bf.select1(i, cb)

bf.select0(i, cb)

Find a value of x where rank0(x) = i as cb(err, x).

If there is no element x in the set where rank0(x) = i, you will receive the value -1. Note that zeros extend unbounded past the largest value in the set (the last 1 in the bitfield).

bf.predecessor(x, cb)

Return the greatest element y in the set where y < x as cb(err, y).

aliases:

  • bf.predecessor1(x, cb)
  • bf.pred(x, cb)
  • bf.pred1(x, cb)
  • bf.prev(x, cb)
  • bf.prev1(x, cb)

bf.predecessor0(x, cb)

Return the greatest non-negative integer element y NOT in the set where y < x as cb(err, y).

aliases:

  • bf.pred0(x, cb)
  • bf.prev0(x, cb)

bf.successor(x, cb)

Return the smallest element y in the set where y > x as cb(err, y).

aliases:

  • bf.successor1(x, cb)
  • bf.succ(x, cb)
  • bf.succ1(x, cb)
  • bf.next(x, cb)
  • bf.next1(x, cb)

bf.successor0(x, cb)

Return the smallest non-negative integer element y NOT in the set where y > x as cb(err, y).

aliases:

  • bf.succ0(x, cb)
  • bf.next0(x, cb)

var multi = new MultiBitfield(storage)

Create a new multi instance to open multiple bitfields on the same database under different prefixes with bound flushes. This way you can move membership between different sets on the same underlying database flush().

var bf = multi.open(prefix, opts)

Open a bitfield database stored under prefix.

You should take care to delimit your prefixes to prevent clashes, for example by using a trailing delimiter such as ! and disallowing ! from the prefix area if it comes from user input.

multi.close(prefix)

Close a bitfield under prefix. Flushes will no longer include writes from this database.

multi.flush(cb)

Flush all of the open databases in a single underlying database flush.

license

license zero parity and apache 2.0 (contributions)

Package Sidebar

Install

npm i bitfield-db

Weekly Downloads

3

Version

1.2.3

License

Parity-6.0.0 AND Apache-2.0

Unpacked Size

77.4 kB

Total Files

24

Last publish

Collaborators

  • nopersonsmodules