arachne
arachne is a utility for scripting connected subgraphs.
Databases frequently contain multiple independent data sets, especially in
multi-tenant systems where client data are stored in the same tables but have
no relation to each other. Extracting a single data set is more difficult the
more complex the schema, and automated tools like pg_dump
are all-or-nothing
propositions.
Given a database, a table to start from, and the value of a primary key in that
table, arachne crawls the relationship graph to determine the dependency
structure and exports the data in a safe order so no row is inserted before
data it depends on. The output format is similar to that of pg_dump
.
Requirements
Certain criteria must be met in order for arachne to analyze and script a complete dataset:
- All foreign keys must be explicitly defined.
- All tables must have primary key constraints, including junction tables.
- Tenant data must be fully segregated, such that there is no row in any table referenced by data belonging to multiple tenants.
- There must be no cyclic dependencies (where two rows depend on each other, an impossibility so long as referential integrity is maintained).
Installation
npm i -g arachne
Usage
As an application:
arachne --database mydb mytable 1234
As a module:
const arachne = require('arachne');
const config = {}; // pg connection configuration
const criteria = {id: 1234};
arachne(config, 'sourcetable', criteria).then(script => {...});
Note that with complex schemata and large tables it may take some time to traverse and script everything!