At its most basic level, dbsync is a command line tool that scans a path for files to run as SQL migrations. Migrations found which have not been (successfully) run before will be run in ascending order based on path/filename. Various options can be used to change those basic behaviors.
Version 2.0 is a small breaking change that fixes a problem with migrations containing a ? character. For most users this should be transparent. However, if you previously hit that problem and worked around it with manual escaping, that escaping is no longer necessary, and it may now be left in the migration as literal text.
dbsync supports any database that Knex.js does. For now, that means:
- and maybe Oracle (mentioned in the Knex docs as experimental support)
The db "migration" concept
Any time a change in the code requires a change in the db structure, a transformation of the data, and/or a small amount of new data to be inserted, it should happen as a SQL "migration". The following bullets describe migrations as they are generally used, and the default behaviors for dbsync; however, all of those behaviors can be altered via the command line options described later.
- Generally, once a migration has been applied to a given database, it will not be applied again to that same database.
- Migration files are not inspected for changes; if the filename is the same as a previously successful migration, it will not be run again.
- A migration file will either be applied in its entirety or not at all. If a migration is not applied (due to an error), no more migrations will be attempted, and the failed migration will be attempted again on next execution.
- Migrations are ordered; they are applied in alphabetical order by filename, including path.
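The default selection rules above can be sketched in a few lines of JavaScript. This is only an illustration, not dbsync's actual implementation: the function name `pendingMigrations` and the example filenames are hypothetical, and plain string comparison stands in for whatever ordering dbsync really applies.

```javascript
// Illustrative sketch only; not dbsync's actual implementation.
// Given every migration file found and the set of migrations already
// recorded as successful, return the ones still to run, in order.
function pendingMigrations(foundFiles, alreadySucceeded) {
  return foundFiles
    .filter((file) => !alreadySucceeded.has(file)) // skipped by name, never by content
    .sort(); // assumption: default string comparison stands in for "alphabetical"
}

// Hypothetical filenames, relative to the scanned path.
const found = ["002-add-index.sql", "users/001-create.sql", "001-init.sql"];
const succeeded = new Set(["001-init.sql"]);

console.log(pendingMigrations(found, succeeded));
```

Because the comparison includes the path, a file in a subdirectory sorts by its directory name first, which is why naming conventions matter when migrations are spread across folders.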
Using dbsync to perform migrations
dbsync should be scripted to occur as part of application deploy and/or startup, and so should rarely need to be initiated independently.
Data should almost always be inserted via INSERT statements, usually with column names, e.g.
INSERT INTO table_name (col_name_1, col_name_2, ...) VALUES (value1, value2, ...);
Avoid inserting values into columns that can be auto-generated (like serial ids) – let the column's sequence assign the value. If these auto-generated ids need to be used in SQL somewhere else in the migration, it's still best to let the value get auto-generated, and then just select it later.
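As a sketch of the "select it later" approach, the migration below inserts a parent row without specifying its id, then looks the id up when inserting a dependent row. The table and column names (authors, books, name, author_id, title) are hypothetical, and the sketch assumes authors.id is auto-generated and authors.name is unique. The SQL is held in a JavaScript string only so that it can be printed or written out as a .sql file.

```javascript
// Hypothetical schema: authors(id auto-generated, name unique),
// books(author_id references authors.id, title).
const migrationSql = [
  "INSERT INTO authors (name) VALUES ('Ada Lovelace');",
  "INSERT INTO books (author_id, title)",
  "  SELECT id, 'Notes on the Analytical Engine' FROM authors WHERE name = 'Ada Lovelace';",
].join("\n");

console.log(migrationSql);
```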
Command line use
In order to use dbsync as a stand-alone utility, you will need to install the appropriate database node client module (see the Knex.js documentation for supported clients), either locally wherever dbsync will be used, or globally.
Below are the details of command line options available for dbsync (the same help block can be displayed by running dbsync --help):
--help, -h Show help
--path, -p Directory to scan for migration files; this part of the file path
will not be used when determining whether a migration has been run
--client, --db A db client string, as appropriate to pass to knex in the
initialization object: http://knexjs.org/#Installation-client
--connection, --conn Additional db connection options, as appropriate to pass to knex in
the initialization object: http://knexjs.org/#Installation-client In
order to set subproperties, use dot notation as in these examples:
--client=mysql --connection.user=usr --connection.host=localhost
--client=sqlite3 --connection.filename=./mydb.sqlite [required]
--table, -t Table name for tracking migrations [default: "dbsync_migrations"]
--files Glob pattern used to filter which files in the path are treated as
migrations. May be specified multiple times, in which case files
matching any of the globs will be treated as migrations.
--encoding, -e Encoding to use when reading files. [default: "utf8"]
--case-sensitive, -c If set, the glob pattern will be matched against files case-sensitively.
--recursive, -r If set, subdirectories within the path will be traversed and searched
for migrations matching the filter pattern; this option is ignored if
the files glob option contains a slash character.
--order, -o Governs whether migrations are ordered based on the file's basename
alone, or the full file path; must be one of: basename, path
--logging, -l Logging verbosity; must be one of: debug, info, warn, error, silent
--on-read-error Governs behavior when an unusual directory read error is encountered;
must be one of: ignore, log, exit [default: "exit"]
--test If set, instead of performing migrations, dbsync will simply log any
messages about actions it would have performed normally.
--autocommit If set, the commands in the migrations will be run and committed
individually, rather than wrapping each migration inside a single
transaction; this is useful if you want to manually manage multiple
transactions within a migration, or if you want to execute commands
not allowed within a transaction (like DROP DATABASE). This option
conflicts with --migration-at-once and --one-transaction.
--migration-at-once If set, dbsync will not count lines or commands, but instead will
load each migration entirely into memory and pass it to the db at
once. This will keep dbsync from processing the text of the
migration, but might require a lot of memory for large migrations.
This option conflicts with --autocommit.
--one-transaction, -1 If set, dbsync will run all the migrations as a single transaction,
rather than one transaction per migration. Migration table
initialization, and updates to the migration table for pending and
failed migrations will not be executed as part of this transaction.
This option conflicts with --autocommit.
--forget If set, dbsync will not record any migrations that it performs during
this run, nor will it create the migrations table if it doesn't yet
exist, but it will still refuse to run migrations that have succeeded
previously (unless --blindly is also used); this is useful for
scripting misc db commands without requiring any additional client
tools to be installed.
--blindly If set, dbsync will not restrict the migrations performed to only
those that have not run successfully before; this is useful for
rerunning previously-run scripts during development, or for scripting
misc db commands without requiring any additional client tools to be installed.
--reminder Some environments where dbsync may run (e.g. CircleCI) require
periodic output to ensure the process is running properly. Sometimes
a migration could take a long time, but without this indicating a
problem. In such cases, you can use --reminder to request reminder
output every X minutes (a best-effort attempt is made, so to be safe
you should set a lower value than you really need); a value of 0
disables reminder output. Value must be non-negative, but can be a
decimal (e.g. --reminder .5) [default: 0]
--dollar-quoting If set, this will force dbsync to allow dollar-quoted strings as
specified by PostgreSQL. The negation, --no-dollar-quoting, is also
available to force quoting to be turned off (this may make a minor
improvement to resource usage by dbsync). The default value for this
option is true when --client is set to 'pg', and false otherwise.
This is not relevant when --migration-at-once is also used. For more
details on dollar-quoting, see section 4.1.2.4 (Dollar-quoted String
Constants) of the PostgreSQL documentation.
--command-buffering This sets the number of SQL commands to buffer before pausing reading
from the file; must be a positive integer. This is a
performance-tuning option and shouldn't need to be altered for most
use cases. [default: 4]
--stack-traces If set, stack traces will be logged with any errors (when present).
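To make the difference between the two --order values concrete, here is a minimal sketch of how the same set of files could sort under each. It is not dbsync's actual implementation: `sortMigrations` and the filenames are hypothetical, and plain string comparison is assumed.

```javascript
const path = require("path");

// Illustrative sketch only; not dbsync's actual implementation.
// order "basename" compares file names alone; any other value compares
// the full relative path.
function sortMigrations(files, order) {
  const key = order === "basename" ? (f) => path.posix.basename(f) : (f) => f;
  return [...files].sort((a, b) => {
    const ka = key(a);
    const kb = key(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}

// Hypothetical files found with --recursive.
const files = ["b/001-first.sql", "a/002-second.sql"];

console.log(sortMigrations(files, "basename"));
console.log(sortMigrations(files, "path"));
```

With these two files, ordering by basename runs b/001-first.sql first, while ordering by path runs a/002-second.sql first.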
dbsync may also be used programmatically as a node dependency. All the features available through the command line are still available when used this way, plus some additional flexibility not available from the command line.
require("dbsync") returns the Migrator class; you instantiate a Migrator instance with something like:
var Migrator = require("dbsync");
var migrator = new Migrator(options);
options is an object containing keys just like the argument options described above, subject to the caveats below:
- The --help command line option has no equivalent in the options object.
- Options required for the command line are also required for the options object, except that path is not required if dbsync will not perform migrations based on files.
- Where multiple forms of an argument option are available for the command line, only the first listed works for programmatic use.
- Options that do not take values will be interpreted as booleans; that is, if the option is given a truthy value, it will be turned on.
- For options that take values, the same allowed values and defaults apply.
- Options allowing dot notation on the command line (such as --connection) correspond to a nested object
- Multiple uses of an option (such as --files) correspond to an array of values
dbsync --client=mysql \
will use options equivalent to the following options object:
files: ['*.sql', '*.pgsql'],
"stack-traces": false // all booleans (except 'dollar-quoting') default to
// false, so this is the same as not specifying it
Note that invalid options passed to the Migrator constructor could result in thrown errors.
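As a fuller illustration of the mapping rules, the sketch below pairs a hypothetical command line (built from flags shown in the help text, with a made-up --path value) with an equivalent options object, and guards the constructor call against thrown errors. The helper `createMigrator` is hypothetical; it takes the Migrator class as a parameter (normally the value of require("dbsync")) so that the sketch runs without dbsync installed.

```javascript
// Hypothetical command line:
//   dbsync --path=./migrations --client=mysql \
//          --connection.user=usr --connection.host=localhost \
//          --files='*.sql' --files='*.pgsql'
const options = {
  path: "./migrations",
  client: "mysql",
  connection: { user: "usr", host: "localhost" }, // dot notation becomes a nested object
  files: ["*.sql", "*.pgsql"], // a repeated option becomes an array
};

// The constructor may throw on invalid options, so guard the call.
// Returns the new instance, or null if construction failed.
function createMigrator(MigratorClass, opts) {
  try {
    return new MigratorClass(opts);
  } catch (err) {
    console.error("Invalid dbsync options:", err.message);
    return null;
  }
}
```

In real use this would be called as createMigrator(require("dbsync"), options).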
Additional options flexibility
In addition to the options allowed from the command line, the following are also available:
- logging: passing a falsy value is equivalent to 'silent'
- logging: passing an object with functions for its debug, info, warn, error, and log properties will cause dbsync to use the passed object for logging
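A custom logging object might look like the sketch below, which keeps messages in memory instead of printing them (handy in tests). The five method names come from the description above; the variadic argument handling, the `entries` array, and the name `createMemoryLogger` are assumptions made for this example.

```javascript
// Builds a logger exposing debug, info, warn, error, and log functions.
// Each call is recorded in `entries` rather than written to the console.
function createMemoryLogger() {
  const entries = [];
  const record = (level) => (...args) => {
    entries.push({ level, message: args.join(" ") });
  };
  return {
    entries,
    debug: record("debug"),
    info: record("info"),
    warn: record("warn"),
    error: record("error"),
    log: record("log"),
  };
}

const logger = createMemoryLogger();
// In real use, this key would sit alongside path, client, connection, etc.
const loggingOptions = { logging: logger };

logger.info("example", "message");
console.log(logger.entries);
```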
Using a Migrator instance
A Migrator instance exposes 4 methods and 1 property of interest:
- migrator.executionContext is an object which may be inspected at any point during or after execution to get information about the execution of the migration set. (This object should never be modified or replaced while a migration set is running.)
- migrator.scanForFiles() returns a promise of an array of strings representing files (relative to the path option) which will be considered for migration, sorted in the order they will be considered according to the order option. Files representing migrations which have already successfully executed will not be filtered from this list.
- migrator.shouldMigrationRun(migrationId) returns a promise of a boolean representing whether the migration needs to run based on the migrator's options (so this function will always yield true if blindly: true is set). The migrationId for a file-based migration is the file's name, including path relative to the path option (as returned by scanForFiles()).
- migrator.doAllMigrationsIfNeeded() is what is used to execute migrations based on a command line invocation. If executionContext.files is falsy, it will be set with the value returned by scanForFiles(); executionContext.files will then be used as its list of migration files. This means it is possible to set a custom list of files (or a custom ordering of files) by setting executionContext.files before calling doAllMigrationsIfNeeded(). This method returns a promise which resolves to executionContext when all the migrations in the list have been skipped (based on the result of shouldMigrationRun()) or have succeeded, or when a single migration fails.
- migrator.doSingleMigrationIfNeeded(migrationId, [migrationSource]) is similar to doAllMigrationsIfNeeded(), but only performs a single migration (or skips it, based on shouldMigrationRun()), and has a number of advanced behaviors available based on migrationSource:
  - If migrationSource is not provided, migrationId is treated as a filename to load as the migration; otherwise, migrationId should be any user-defined string id which will identify this migration.
  - If migrationSource is a string, the string will be used as the content of the migration.
  - If migrationSource is an instance of
stream.Readable, the data from the stream will be used as the content of the migration. Note that the stream must return strings, not buffers. (If you have a stream returning buffers, you can make it return strings by calling
  - If migrationSource is a promise (or then-able) which resolves to a string or a readable stream, the resolved value will be used as described above.
  - If migrationSource is a function, the function will be called and its return value (which must be a string, readable stream, or promise to one of those) will be used as described above. Note that the function will only be called if dbsync intends to run the migration identified by migrationId; this makes it useful for migrations that require some effort/time/resources to set up, such as creating a stream that downloads a file from a server. By putting such setup in a function and passing that as the migrationSource, the download will never be initiated unless dbsync intends to execute the migration.
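The accepted forms of migrationSource described above can be pictured as a small normalization step. The sketch below is not dbsync's actual code: `resolveMigrationSource` is a hypothetical helper that reduces a string, stream, promise, or function to a promise of a string or readable stream, and `lazySource` is a made-up example of deferring expensive setup.

```javascript
const { Readable } = require("stream");

// Illustrative sketch only; not dbsync's actual implementation.
// Accepts a string, a readable stream, a promise of either, or a function
// returning any of those, and yields the string or stream.
async function resolveMigrationSource(migrationSource) {
  const value =
    typeof migrationSource === "function" ? migrationSource() : migrationSource;
  const resolved = await value; // awaiting a non-promise returns it unchanged
  if (typeof resolved === "string" || resolved instanceof Readable) {
    return resolved;
  }
  throw new TypeError("migrationSource must yield a string or a readable stream");
}

// A lazy source: the (hypothetical) expensive setup happens only when the
// function is actually called.
let setupCalls = 0;
const lazySource = () => {
  setupCalls += 1;
  return Promise.resolve("CREATE TABLE example (id INTEGER);");
};
```

With dbsync itself, such a function would be passed as the second argument, for example migrator.doSingleMigrationIfNeeded("download-seed-data", lazySource), where the id is a made-up example.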
Multiple Migrator instances may be created and used in parallel. Since migrating a database is conceptually a serial operation, this should (probably) not be used to perform parallel migrations on the same db, but only to perform migrations on multiple dbs in parallel.
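Migrating several databases in parallel could be sketched as follows. The helper `migrateAll` is hypothetical; it takes the Migrator class as a parameter (normally the value of require("dbsync")) so that the sketch runs without dbsync installed, and it assumes each options object points at a different database.

```javascript
// Creates one migrator per options object and runs them concurrently.
// Per the description above, each promise resolves to that migrator's
// executionContext, so the result is an array of execution contexts.
function migrateAll(MigratorClass, optionsList) {
  return Promise.all(
    optionsList.map((options) =>
      new MigratorClass(options).doAllMigrationsIfNeeded()
    )
  );
}
```

Since doAllMigrationsIfNeeded() is described as resolving even when a single migration fails, callers would presumably need to inspect each returned executionContext rather than rely on a rejected promise.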
Contributions to this project are welcome; please create an issue ticket for bugs or feature requests, and submit a PR if you have made improvements. Feature requests submitted without a quality PR may not be implemented quickly.